Closed
Bug 764640
Opened 12 years ago
Closed 11 years ago
Sending data too quickly through the JSBridge socket causes an overflow and lots of event data can get lost (expect_assert.js, screenshot.js)
Categories
(Testing Graveyard :: Mozmill, defect)
Testing Graveyard
Mozmill
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: whimboo, Assigned: whimboo)
References
Details
(Keywords: dataloss, Whiteboard: [mozmill-2.0+][ateamtrack: p=mozmill q=2013q3 m=1])
Attachments
(4 files, 3 obsolete files)
When I run mozmill master's mutt tests for the assertions lib, I get zero test results returned. ERROR | ProcessManager UNABLE to use job objects to manage child processes ERROR | INFO | ProcessManager NOT managing child processes INFO | DEBUG | mozmill.startRunner | true DEBUG | mozmill.setModule | "currentModule" DEBUG | mozmill.setState | "currentState" TEST-START | z:/data/code/tools/mozmill/mutt/mutt/tests/js/assertions/expect_assert.js | testExpect [..] RESULTS | Passed: 0 RESULTS | Failed: 0 RESULTS | Skipped: 0 All tests were successful. Ship it! Flagging for 2.0 for now until this has been investigated.
Comment 1•12 years ago
|
||
FWIW, the ERROR | output is from outputting to stderr in mozprocess which mozlogger converts to an error line: https://github.com/mozautomation/mozmill/blob/master/mozmill/mozmill/logger.py#L86 Unrelated to the 0 results but worth noting
Assignee | ||
Comment 2•12 years ago
|
||
This is related to the throws/doesNotThrow tests in the test module.
Blocks: 637880
+'ing for investigation.
Whiteboard: [mozmill-2.0?] → [mozmill-2.0+]
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Assignee | ||
Comment 4•12 years ago
|
||
I have re-setup a new Windows XP machine and I can always see this failure. Other versions of Windows are seem to be fine. At least on Windows 7 I'm not able to reproduce.
Assignee | ||
Comment 5•12 years ago
|
||
This is clearly a failure in the framework. In some cases we fail to send the endTest event so the Python code doesn't pick up the pass/fail/skip count. More details later.
Assignee | ||
Comment 6•12 years ago
|
||
This attachment contains the JSON code which we fail to send via JSBridge. I can get rid of the failure with my patch on bug 781376, which means that something invalid is in this object. I will further investigate tomorrow.
Comment 7•12 years ago
|
||
Interesting. Now that bug 781376 has been resolved I assume we'd see this issue much less often, but it would be great to find out the cause.
Assignee | ||
Comment 8•12 years ago
|
||
We would if an assertion fails with the specific data in the details. So we can simply negate the test to make it happen. So it shoudl still be reproducible 100% of the time. I will continue to work on this bug after bug 761564 is fixed. If you want to check feel free to take some work out of my plate. Thanks.
Updated•11 years ago
|
Whiteboard: [mozmill-2.0+] → [mozmill-2.0+] s=130408 u=failure c=mozmill p=1
Updated•11 years ago
|
Assignee: hskupin → andrei.eftimie
Comment 9•11 years ago
|
||
Trying to figure this one out.
When running python tests we get something like this:
> RESULTS | Passed: 1
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
Running any js tests fails to print any "RESULTS", the whole block is missing.
Is this intended?
If so, this bug might not be relevant anymore.
If not, we should file another bug for re-enabling the RESULTS block for JS tests.
Flags: needinfo?(hskupin)
Assignee | ||
Comment 10•11 years ago
|
||
If we don't print the final results it's certainly worth another bug we should fix first.
Flags: needinfo?(hskupin)
Comment 11•11 years ago
|
||
I've set up a WinXP box with mozmill 2.0, but I am unable to reproduce this. According to comment 8 we should be able to reproduce the problem by 'negating the test': > So we can simply negate the test to make it happen. So it shoudl still be reproducible 100% of the time. While running the assertions/expect_assert.js test I always get 2 tests logged. I've tried changing the TEST_DATA objects or the conditions from within the tests to make them fail. So I either get 2 PASSED test or 2 FAILED tests. Henrik, is there something particular that needs "negating" here?
Flags: needinfo?(hskupin)
Assignee | ||
Comment 12•11 years ago
|
||
I'm still seeing this with latest Mozmill code against the Nightly build by running mozmill -b ../bin/nightly/firefox.exe -m mutt/mutt/tests/js/assertions/tests.ini
Flags: needinfo?(hskupin)
Assignee | ||
Comment 13•11 years ago
|
||
This time I see this on Windows 7, so it's not XP only.
Comment 14•11 years ago
|
||
Thanks Henrik, I can also see it on Win7; XP works fine for me. Finally a starting point.
Assignee | ||
Updated•11 years ago
|
Whiteboard: [mozmill-2.0+] s=130408 u=failure c=mozmill p=1 → [ateamtrack: p=mozmill q=2013q2 m=1][mozmill-2.0+]
Assignee | ||
Updated•11 years ago
|
Whiteboard: [ateamtrack: p=mozmill q=2013q2 m=1][mozmill-2.0+] → [ateamtrack: p=mozmill q=2013q2 m=0][mozmill-2.0+]
Comment 15•11 years ago
|
||
An update to what I have found so far. frame.js is firing the mozmill.endTest event here: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/modules/frame.js#L442 but sometimes the endTest method which would append the results to the results list here: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/__init__.py#L69 is not run It might also be that this is indeed fired, but to late, (testrun finishes, handler gets destroyed before receiving all data?), but I have no proof to support this. If this is the case we might need to correctly wait for all reporting to finish before generating the final report.
Assignee | ||
Comment 16•11 years ago
|
||
You should add debugging statements to both sides to check which events are getting fired at which times. That helped me a lot in the past.
Comment 17•11 years ago
|
||
I've added time to my print/dump statements. Formatted Hour:Minute::Second:Milisecond From 2 tests - randomly - none, one or both would be logged. Flow: frame.js => fires the endTest event frame.js => takes the event and fires it through Events.jsm mozmill/__init__.py :: endTest_listener => receives the event and appends test results When our problem occurs the python never triggers the endTest_listener method, which might indicate a problem in Events.jsm The times look a bit off, but I suspect JS and Python dates are not completely in sync? Here is a dump where no test was logged: > ←[1;34mTEST-START←[0m | expect_assert.js | testExpect > > 11:16::27:410 FIRE endTest EVENT (frame.js) > > 11:16::27:410 mozmill.endTest (frame.js => Events.jsm) > 11:16::27:428 FIRE endTest EVENT (frame.js) > > > 11:16::27:817000 final reporting > > RESULTS | Passed: 0 > RESULTS | Failed: 0 > RESULTS | Skipped: 0 > 11:16::27:428 mozmill.endTest (frame.js => Events.jsm) Here is a dump where one test was logged: > ←[1;34mTEST-START←[0m | expect_assert.js | testExpect > > 11:18::26:882 FIRE endTest EVENT (frame.js) > > 11:18::26:882 mozmill.endTest (frame.js => Events.jsm) > 11:18::26:916 FIRE endTest EVENT (frame.js) > > ←[1;32mTEST-PASS←[0m | expect_assert.js | testAssert > > 11:18::26:910000 APPEND TEST RESULTS (mozmill/__init__.py => endTest_listener) > > > TEST-END | expect_assert.js | finished in 64ms > > 11:18::27:331000 final reporting > > RESULTS | Passed: 1 > RESULTS | Failed: 0 > RESULTS | Skipped: 0 > 11:18::26:918 mozmill.endTest (frame.js => Events.jsm) I am not sure how we should proceed further.
Assignee | ||
Comment 18•11 years ago
|
||
If the Python side doesn't get the endTest event, you will have to add debugging statements to jsbridge code. That will show you in detail what's getting transferred or where we bail out with a failure.
Comment 19•11 years ago
|
||
I've dug around deeper and I've reached the point where we open a Socket create a buffer with the message and send it: https://github.com/mozilla/mozmill/blob/master/jsbridge/jsbridge/extension/resource/modules/Sockets.jsm#L64 In both cases (when we fail and when we succeed) all parameters are equal (except for the timestamp in the message and - probably - the equivalent in the buffer). Attaching output with dump statements for when we see both tests correctly logged, and when we see only 1. I'm not sure how to monitor what gets sent through the socket. **Note. I have noticed that the more stuff I dumped to the console, the harder was it to get a failure. With the dumps seen in the attachements I had no results with 0 and one out of 20 runs with 1 result. Without any dumps the ratio would have been roughly 1 in 6 and 1 in 3. I could infer that this would add some delay that would make the test not fail to report. But nothing conclusive.
Comment 20•11 years ago
|
||
Comment 21•11 years ago
|
||
Henrik, any idea on how we could proceed? Based on what I've found the problem seems to lie within the Socket communication. I'm thinking of setting up some sort of proxy to see all communication, but I'm not that it would be feasible - or even possible. Have not tried to eavesdrop on a Socket until now.
Flags: needinfo?(hskupin)
Assignee | ||
Comment 22•11 years ago
|
||
Would you mind to reduce the assertion test? It might be that a specific check in this test and its result cause this awkward behavior. If there is such a check it would be easier for us to track down why it's failing.
Flags: needinfo?(hskupin)
Comment 23•11 years ago
|
||
(In reply to Andrei Eftimie from comment #21) > Henrik, any idea on how we could proceed? > Based on what I've found the problem seems to lie within the Socket > communication. > > I'm thinking of setting up some sort of proxy to see all communication, but > I'm not that it would be feasible - or even possible. Have not tried to > eavesdrop on a Socket until now. If you put dump statements in jsbridge's recv/send code you can see all the information going over the socket. jsbridge/extension/resource/modules/Sockets.jsm
Assignee | ||
Comment 24•11 years ago
|
||
Andrei, have you seen my comment 22? Can you please re-check by reducing the assertion test?
Flags: needinfo?(andrei.eftimie)
Comment 25•11 years ago
|
||
I am unable to reproduce this anymore. Same machine, updated FF and Mozmill. Older FF versions still have this problem. With tinderbox builds I have found the following pushlog which fixes the problem: http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=eeb89381c7a3 This is a merge from birch. Not sure what could have fixed this. We have some changes to a socket but they are advertised as Unix only (and we only had this problem in windows): http://hg.mozilla.org/mozilla-central/rev/6c52203a47b8 Since our problem appeared to be in a Socket connection, might this have fixed it? Henrik, can you please see if you can reproduce this with the latest Nightly?
Flags: needinfo?(andrei.eftimie) → needinfo?(hskupin)
Assignee | ||
Comment 26•11 years ago
|
||
(In reply to Andrei Eftimie from comment #25) > With tinderbox builds I have found the following pushlog which fixes the > problem: > http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=eeb89381c7a3 > > This is a merge from birch. That doesn't make any sense given that it was a merge from mozilla-central to birch and not the other way around. Nothing in here should have changed the behavior for Nightly or a tinderbox build for Nightly. I tried to run the test in my Win7 VM but it was not successful given the failure: "message": "redeclaration of var Cu", "fileName": "resource://mozmill/stdlib/securable-module.js -> resource://mozmill/modules/assertions.js", "name": "TypeError", "lineNumber": 7 Do you see the same? It's only happening for me on Windows with latest code of Mozmill but not on Linux.
Flags: needinfo?(hskupin)
Comment 27•11 years ago
|
||
I have no failures on Win 7.
Double checked that I am running latest versions of mozmill-env, mozmill, Nightly.
> $ mozmill -b /c/Mozilla/Nightly/firefox-24.0a1.en-US.win32/firefox/firefox.exe - m /c/work/mozmill/mutt/mutt/tests/js/assertions/tests.ini
> TEST-START | expect_assert.js | testExpect
> TEST-PASS | expect_assert.js | testExpect
> TEST-START | expect_assert.js | testAssert
> TEST-PASS | expect_assert.js | testAssert
> TEST-END | expect_assert.js | finished in 154ms
> RESULTS | Passed: 2
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
>
> svuser@PXXXWIN7VM ~
> $ mutt -b /c/Mozilla/Nightly/firefox-24.0a1.en-US.win32/firefox/firefox.exe -m / c/work/mozmill/mutt/mutt/tests/js/assertions/tests.ini
> Running python tests
>
> ----------------------------------------------------------------------
> Ran 0 tests in 0.000s
>
> OK
> Running JS Tests
> TEST-START | expect_assert.js | testExpect
> TEST-PASS | expect_assert.js | testExpect
> TEST-START | expect_assert.js | testAssert
> TEST-PASS | expect_assert.js | testAssert
> TEST-END | expect_assert.js | finished in 64ms
> ===========================================================================
> Python Failures:
>
> ===========================================================================
> Javascript Failures:
>
> Total passed: 2
> Total failed: 0
> Total skipped: 0
================
I got that pushlog by opening "about:buildconfig" in the first tinderbox build that didn't reproduce the problem, and clicked on the link "built from"
Comment 28•11 years ago
|
||
If we can't repro anymore than mark as WFM?
Comment 29•11 years ago
|
||
Hopefully yes. I want someone has has previously seen this to confirm that it isn't happening anymore.
Comment 30•11 years ago
|
||
I've come across something similar, which might in fact be the same issue, but with a 100% reproducibility rate. Latest patch on bug 840487 (we should probably wait for that to land to make debugging this easier), under mozmill 2.0 with a en-US build passes nicely. However with a jp or fr build i _never_ get a PASS statement and the final output looks like > RESULTS | Passed: 0 > RESULTS | Failed: 0 > RESULTS | Skipped: 0 de builds work fine. Preliminary info might suggest encoding problems. Previous research showed the problem lying somewhere in the socket communication layer. So UTF or encoding problems related to special charactes _might_ be a cause. Will investigate more deeply.
Assignee | ||
Comment 31•11 years ago
|
||
Same happens on OS X when running: mozmill -b /mozilla/bin/nightly/firefox -t mutt/mutt/tests/js/frame/user_shutdown/multiple/test5.js So this is most likely a shutdown problem we might be able to fix with the landing of bug 865690.
Depends on: 865690
Assignee | ||
Comment 32•11 years ago
|
||
Oh, I meant Linux not OS X.
OS: Windows XP → All
Hardware: x86 → All
Comment 33•11 years ago
|
||
That sounds very reasonable. We send information through the Socket, but it looks like we never receive it. If our connection shuts down before all messages come through looks like a very plausible explanation.
Comment 34•11 years ago
|
||
I also noticed variances from run to run in both expect_assert.js and screenshot.js in the course of evaluating whimboo's patch for bug 865690. Simply running in clean master multiple times, there were similar variances, from run to run. Some would contain a TEST-END, TEST-START, some not. Configuration: Windows XP32 SP3 Latest Nightly 25.0a1 20130625031238 I can provide logs if required, but won't initially so I don't clutter up the attachments list.
Summary: Mozmill returns PASS/SKIP/FAIL == 0 for test_assertions.js → Mozmill returns PASS/SKIP/FAIL == 0 for test_assertions.js, screenshot.js
Comment 35•11 years ago
|
||
And for context, whimboo has already received those logs via email.
Assignee | ||
Updated•11 years ago
|
Summary: Mozmill returns PASS/SKIP/FAIL == 0 for test_assertions.js, screenshot.js → Mozmill returns PASS/SKIP/FAIL == 0 for expect_assert.js, screenshot.js
Comment 36•11 years ago
|
||
Disregard the failure mentioned in comment 30. Created bug 889817 for it since its an entirely different issue. Most of this bug is no longer reproducible. Lets wait for bug 865690 to land. We'll probably be able to close this one as well then.
Comment 37•11 years ago
|
||
I retested with the latest mozmill 2.0 application-shutdown changes landed from bug 865690, out of interest. I did about a dozen `mutt testjs` runs. With respect to this open bug, as would be expected I am still seeing the same problem and variances run to run with expect_assert.js and screenshot.js, described in Comment 34. Config: Windows XP32 SP3 mozmill-env w/mozmill 2.0rc4 latest Nightly 25.0a1 20130704031323
Comment 38•11 years ago
|
||
Thanks Jonathan for the update! I'll try to replicate your results on a Win XP box. I haven't been able to see the problem in Windows7 anymore.
Comment 39•11 years ago
|
||
I have not been able to reproduce this on WinXP (50 clean runs). But I see it on Win7 now. Will try to dig it up again.
Comment 40•11 years ago
|
||
Reproducing this with --debug shows that no Events are triggered after shutdown. Doesn't seem related to bug 865690 Only visible thing from running with --debug is that the TEST-START event is logged later then when it pases.
Comment 41•11 years ago
|
||
(In reply to Andrei Eftimie from comment #40) > Doesn't seem related to bug 865690 Indeed/agreed.
Assignee | ||
Comment 42•11 years ago
|
||
Ok, I will take this bug so we can get this finished off. Otilia, please find a replacement for. I did a look at those tests and I'm not exactly sure yet what's causing this problem. It might be the size or the content of the data we are trying to send via jsbridge. I have to dig deeper into it. At least I have seen a strange functionName for one of the checks in expect_assert.js: {"fileName":"file:///c:/mozilla/mozmill/mutt/mutt/tests/js/assertions/expect_assert.js","functionName":"testAssert/<","lineNumber":73,"message":" (Error). Doesn't throw on named exception - Got unwanted exception"} Please see the 'testAssert/<' part. It might be not related but I will check this first and fix it. I will keep you informed about updates.
Assignee: andrei.eftimie → hskupin
Flags: needinfo?(otilia.anica)
Assignee | ||
Comment 43•11 years ago
|
||
Ok, after digging around a lot without success, I finally picked up the content size and mentioned that to Clint. He replied that we might have an overflow while sending the data. That explosively raised some ideas in my head and I tried a short sleep in wrapper()@frame.js. This let the tests pass! Wohoo! So yes, we send data too quickly. So I looked into the NSS and Sockets module of JSBridge. I tried to get a result for the PR_Send call but no exception gets raised. So I tried to check the return value of the method, and noticed that it might be the length of the data sent. In case of this failure I was able to see a lot of -1 values. So I checked on MXR for the PR_Send usage, and found out that we have to retry sending if -1 gets returned. A patch I have implemented is exactly doing that now, and I can't see the failure anymore. All affected tests are passing now.
Summary: Mozmill returns PASS/SKIP/FAIL == 0 for expect_assert.js, screenshot.js → Sending data too quickly through the JSBridge socket causes an overflow and lots of event data can get lost (expect_assert.js, screenshot.js)
Assignee | ||
Comment 44•11 years ago
|
||
This patch fixes the issue for me and does not introduce new problems on any platform. Andrei and Jonathan, can you both please test this on your machines, given that you were also able to see it? Thanks.
Attachment #775049 -
Flags: review?(ctalbert)
Attachment #775049 -
Flags: feedback?(tojonmz)
Attachment #775049 -
Flags: feedback?(andrei.eftimie)
Assignee | ||
Updated•11 years ago
|
Attachment #775049 -
Attachment description: Patch v1 → WIP v1
Comment 45•11 years ago
|
||
I don't seem to see the original problem anymore, at least not to the full extent. I'm running: > mozmill -m mutt/mutt/tests/js/assertions/tests.ini -b /c/Mozilla/Nightly/firefox-25.0a1.en-US.win32/firefox/firefox.exe And with the WIP patch I get 5 out of 6 runs: > TEST-START | expect_assert.js | testExpect > TEST-PASS | expect_assert.js | testAssert > TEST-END | expect_assert.js | finished in 48ms > RESULTS | Passed: 1 > RESULTS | Failed: 0 > RESULTS | Skipped: 0 It should register (I'm getting this correct result with the new patch in 1 out of 6 runs): > TEST-START | expect_assert.js | testExpect > TEST-PASS | expect_assert.js | testExpect > TEST-START | expect_assert.js | testAssert > TEST-PASS | expect_assert.js | testAssert > RESULTS | Passed: 2 > RESULTS | Failed: 0 > RESULTS | Skipped: 0 We still have some issues to solve.
Comment 46•11 years ago
|
||
Some runs with debug statements: http://pastebin.com/h7DEwJzW // 1 passed http://pastebin.com/A79ccKda // 0 passed
Comment 47•11 years ago
|
||
Previous links were with --console-level=DEBUG This one is with --debug: http://pastebin.com/nPErkkVW
Assignee | ||
Comment 48•11 years ago
|
||
Sorry, those should have been run with --debug so we can see which message get send over the bridge. As said on IRC I have never seen it failing anymore. Are you sure the patch has been applied correctly?
Assignee | ||
Comment 49•11 years ago
|
||
Are you sure you are on the latest master? Usually you should get more information with --debug and not only "* Fire event: 'mozmill.pass'". What I see is: * Sending message: '{"eventType":"mozmill.pass","result":{"pass":{"fileName":"file:///c:/mozilla/mozmill/mutt/mutt/tests/js/assertions/expect_assert.js","functionName":"testAssert","lineNumber":118,"message":" (AssertionError). assert.deepEqual for [[object Object], [object Object], Objects are equal]"}}}' Please closely check that. I assume that's the reason why it does not work for you.
Comment 50•11 years ago
|
||
I am on the lastest master + this WIP patch applied.
Edit: the problem was with my mozmill-env, re-initialized it completely and it works now, seems somthing was off there.
I don't get anymore failures!!
Your fix appears to be working nicely.
We should probably log this message only in debug
> Failed sending message through the socket. Retrying.
Comment 51•11 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #44) > Created attachment 775049 [details] [diff] [review] > WIP v1 > Andrei and Jonathan, can you both please test this on your machines, given > that you were also able to see it? Thanks. Just back from several days off, trying it now also. Will report back.
Assignee | ||
Comment 52•11 years ago
|
||
(In reply to Andrei Eftimie from comment #50) > I am on the lastest master + this WIP patch applied. > Edit: the problem was with my mozmill-env, re-initialized it completely and > it works now, seems somthing was off there. Great. So my assumption was right. > We should probably log this message only in debug > > Failed sending message through the socket. Retrying. Correct. That's why it's a WIP patch. The final patch will also include a Python test.
Comment 53•11 years ago
|
||
Comment on attachment 775049 [details] [diff] [review] WIP v1 Review of attachment 775049 [details] [diff] [review]: ----------------------------------------------------------------- Missed the feedback flag, as stated above this works fine for me :) Great find and fix Henrik! ::: jsbridge/jsbridge/extension/resource/modules/Sockets.jsm @@ +69,5 @@ > + buffer = new NSS.Sockets.buffer(message.substring(sent)); > + count = NSS.Sockets.PR_Send(this.fd, buffer, buffer.length, > + 0, NSS.Sockets.PR_INTERVAL_MAX); > + if (count < 0) { > + dump('Failed sending message through the socket. Retrying.\n'); We should only log this when debug is active. (or don't log it at all)
Attachment #775049 -
Flags: feedback?(andrei.eftimie) → feedback+
Comment 54•11 years ago
|
||
Comment on attachment 775049 [details] [diff] [review] WIP v1 I did a dozen runs with latest master to re-re-produce the problem variances with expect_assert.js and screenshot.js. And then I did 20 runs with this WIP patch overlayed; and the missing result lines are now correctly present in all 20 patched runs: TEST-PASS | js\assertions\expect_assert.js | testAssert TEST-END | js\assertions\expect_assert.js TEST-PASS | js\controller\screenshot.js | test TEST-PASS | js\controller\screenshot.js | testChrome I diffed all the 20 output files against each other, and grep'd them with wc to be absolutely sure. Config: Windows XP32 SP3 mozmill-env w/mozmill 2.0rc4 latest latest Nightly 25.0a1 20130716030202 138524 I too, see many of the intended 'Failure sending message through socket' diagnostics in the output. I gather the volume of them is expected. An amount ranging from about 500 to 1,500 of them in an output file, for each `mutt testjs` run.
Attachment #775049 -
Flags: feedback?(tojonmz) → feedback+
Assignee | ||
Comment 55•11 years ago
|
||
Thanks Jonathan! That's great to hear that it is also fixed for you. And as stated above this output is wanted to show you how many information we were leaking before. So by tomorrow I will update the patch and also include a python test so we can be sure this dataloss will never happen again under those circumstances.
Severity: normal → critical
Keywords: dataloss
QA Contact: hskupin
Whiteboard: [ateamtrack: p=mozmill q=2013q2 m=0][mozmill-2.0+] → [mozmill-2.0+]
Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(otilia.anica)
Assignee | ||
Comment 56•11 years ago
|
||
Comment on attachment 775049 [details] [diff] [review] WIP v1 I will re-ask for feedback once the patch is ready.
Attachment #775049 -
Flags: review?(ctalbert)
Assignee | ||
Comment 57•11 years ago
|
||
I know that I said a test would be good to have, but I was not able to create a reliable one which always fails with the old implementation. I got more passes. So I don't think that I should spend more time on trying to deeper investigate that. We should get this patch in and should be happy. All of our tests are still working and even others are fixed with this patch landed.
Attachment #775049 -
Attachment is obsolete: true
Attachment #777394 -
Flags: review?(ctalbert)
Comment 58•11 years ago
|
||
I will do some new runs with the latest patch and report back.
Comment 59•11 years ago
|
||
Another dozen runs minus the patch, just to confirm the test environment was valid. And 20 runs with the final patch v1 in Comment 57 overlayed; the missing result lines are still correctly present in all 20 runs. Pass: 61 Fail: 0 Skip: 2 Config: Windows XP32 SP3 mozmill-env w/mozmill 2.0rc4 latest latest Nightly 25.0a1 20130717030207 138720
Assignee | ||
Comment 60•11 years ago
|
||
Fantastic. That's really great to hear!
Comment 61•11 years ago
|
||
Comment on attachment 777394 [details] [diff] [review] Patch v1 Review of attachment 777394 [details] [diff] [review]: ----------------------------------------------------------------- This looks great. r+ with the one comment addressed. I tested it on windows 7 64, and couldn't get anything to fail. So I think we might have finally nailed this one. Good job Henrik, Andrei and Jonathan! ::: jsbridge/jsbridge/extension/resource/modules/Sockets.jsm @@ +70,5 @@ > + buffer = new NSS.Sockets.buffer(message.substring(sent)); > + count = NSS.Sockets.PR_Send(this.fd, buffer, buffer.length, > + 0, NSS.Sockets.PR_INTERVAL_MAX); > + if (count < 0) { > + Log.dump('PR_Send', 'Failure sending data. Retrying...'); If we lose the socket then this is going to retry forever. Let's put something in here to prevent that case. The safest is probably an upper limit on retries. Often when the socket goes away our handle to it (this.fd) is still valid, so we can't simply test for that being not null. I'd recommend putting something in where we retry some number of times and potentially break out of the loop if we hit max retries. I'd say a 100 retries should be plenty (pulling a number out of thin air).
Attachment #777394 -
Flags: review?(ctalbert) → review+
Assignee | ||
Comment 62•11 years ago
|
||
So I was trying to find something better and also found it. If count is -1 an error is expected, and you can retrieve it via PR_GetError(). So I have implemented this method additionally to the other Socket methods. The error value in those cases is PR_WOULD_BLOCK_ERROR. After some searching I found the following webpage which fantastically describes what's going on: http://www.mail-archive.com/dev-tech-crypto@lists.mozilla.org/msg09750.html So a better solution here is to check for that error and retry to send the data. At some point the send buffer will be empty again and we can send our message. If another error occurs we will break the loop. That way we do not need a hardcoded limit, which I usually never like. :) The patch is coming up in a bit with some refactoring of NSS.jsm.
Assignee | ||
Comment 63•11 years ago
|
||
Attachment #777394 -
Attachment is obsolete: true
Attachment #778028 -
Flags: review?(ctalbert)
Assignee | ||
Comment 64•11 years ago
|
||
The last version of the patch had a double quote due to the autocomplete feature of sublimetext. This is fixed now.
Attachment #778028 -
Attachment is obsolete: true
Attachment #778028 -
Flags: review?(ctalbert)
Attachment #778034 -
Flags: review?(ctalbert)
Comment 65•11 years ago
|
||
Comment on attachment 778034 [details] [diff] [review] 0001-Bug-764640-Sending-data-too-quickly-through-the-JSBr.patch Review of attachment 778034 [details] [diff] [review]: ----------------------------------------------------------------- I think this looks good. Tested it again on win 7 64, everything came through with flying colors. ::: jsbridge/jsbridge/extension/resource/modules/Sockets.jsm @@ +65,5 @@ > + var buffer; > + var count; > + var sent = 0; > + > + // We have to send a final '\0' so we need one more char Good addition. @@ +73,5 @@ > + 0, NSS.Sockets.PR_INTERVAL_MAX); > + if (count < 0) { > + var error = NSS.Sockets.PR_GetError(); > + if (error !== NSS.Types.PR_WOULD_BLOCK_ERROR) { > + Log.dump("PR_Send", "Failed with error " + error); Much better than my idea. Thanks.
Attachment #778034 -
Flags: review?(ctalbert) → review+
Assignee | ||
Comment 66•11 years ago
|
||
Landed on master: https://github.com/mozilla/mozmill/commit/42aa3cb84ba201b2fc6368cf193302311e4a78cb
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [mozmill-2.0+] → [mozmill-2.0+][ateamtrack: p=mozmill q=2013q3 m=1]
Updated•8 years ago
|
Product: Testing → Testing Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•