Closed Bug 764640 Opened 8 years ago Closed 7 years ago

Sending data too quickly through the JSBridge socket causes an overflow and lots of event data can get lost (expect_assert.js, screenshot.js)

Categories

(Testing Graveyard :: Mozmill, defect, critical)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Keywords: dataloss, Whiteboard: [mozmill-2.0+][ateamtrack: p=mozmill q=2013q3 m=1])

Attachments

(4 files, 3 obsolete files)

When I run mozmill master's mutt tests for the assertions lib, I get zero test results returned.

ERROR | ProcessManager UNABLE to use job objects to manage child processes
ERROR |
INFO | ProcessManager NOT managing child processes
INFO |
DEBUG | mozmill.startRunner | true
DEBUG | mozmill.setModule | "currentModule"
DEBUG | mozmill.setState | "currentState"
TEST-START | z:/data/code/tools/mozmill/mutt/mutt/tests/js/assertions/expect_assert.js | testExpect
[..]
RESULTS | Passed: 0
RESULTS | Failed: 0
RESULTS | Skipped: 0
All tests were successful.  Ship it!

Flagging for 2.0 for now until this has been investigated.
FWIW, the ERROR | output comes from mozprocess writing to stderr, which mozlogger converts to an error line:

https://github.com/mozautomation/mozmill/blob/master/mozmill/mozmill/logger.py#L86

Unrelated to the 0 results, but worth noting.
This is related to the throws/doesNotThrow tests in the test module.
Blocks: 637880
+'ing for investigation.
Whiteboard: [mozmill-2.0?] → [mozmill-2.0+]
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
I have re-set up a new Windows XP machine and I can always see this failure. Other versions of Windows seem to be fine. At least on Windows 7 I'm not able to reproduce.
This is clearly a failure in the framework. In some cases we fail to send the endTest event so the Python code doesn't pick up the pass/fail/skip count. More details later.
Attached file broken json data
This attachment contains the JSON code which we fail to send via JSBridge. I can get rid of the failure with my patch on bug 781376, which means that something invalid is in this object. I will further investigate tomorrow.
Interesting. Now that bug 781376 has been resolved I assume we'd see this issue much less often, but it would be great to find out the cause.
We would see it if an assertion fails with the specific data in the details. So we can simply negate the test to make it happen; it should still be reproducible 100% of the time. I will continue to work on this bug after bug 761564 is fixed. If you want to check, feel free to take some work off my plate. Thanks.
Whiteboard: [mozmill-2.0+] → [mozmill-2.0+] s=130408 u=failure c=mozmill p=1
Assignee: hskupin → andrei.eftimie
Trying to figure this one out.

When running python tests we get something like this:
> RESULTS | Passed: 1
> RESULTS | Failed: 0
> RESULTS | Skipped: 0

Running any JS tests fails to print any "RESULTS"; the whole block is missing.

Is this intended?
If so, this bug might not be relevant anymore.
If not, we should file another bug for re-enabling the RESULTS block for JS tests.
Flags: needinfo?(hskupin)
If we don't print the final results, that's certainly worth another bug, which we should fix first.
Flags: needinfo?(hskupin)
Depends on: 859746
I've set up a WinXP box with mozmill 2.0, but I am unable to reproduce this.

According to comment 8 we should be able to reproduce the problem by 'negating the test':
> So we can simply negate the test to make it happen. So it should still be reproducible 100% of the time.

While running the assertions/expect_assert.js test I always get 2 tests logged.
I've tried changing the TEST_DATA objects or the conditions from within the tests to make them fail.
So I either get 2 PASSED tests or 2 FAILED tests.

Henrik, is there something particular that needs "negating" here?
Flags: needinfo?(hskupin)
I'm still seeing this with latest Mozmill code against the Nightly build by running

mozmill -b ../bin/nightly/firefox.exe -m mutt/mutt/tests/js/assertions/tests.ini
Flags: needinfo?(hskupin)
This time I see this on Windows 7, so it's not XP only.
Thanks Henrik, I can also see it on Win7; XP works fine for me.
Finally a starting point.
Whiteboard: [mozmill-2.0+] s=130408 u=failure c=mozmill p=1 → [ateamtrack: p=mozmill q=2013q2 m=1][mozmill-2.0+]
Whiteboard: [ateamtrack: p=mozmill q=2013q2 m=1][mozmill-2.0+] → [ateamtrack: p=mozmill q=2013q2 m=0][mozmill-2.0+]
An update to what I have found so far.

frame.js is firing the mozmill.endTest event here:
https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/modules/frame.js#L442

but sometimes the endTest method, which would append the results to the results list here: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/__init__.py#L69, is not run.

It might also be that this is indeed fired, but too late (the test run finishes and the handler gets destroyed before receiving all data?), but I have no proof to support this.

If this is the case we might need to correctly wait for all reporting to finish before generating the final report.
You should add debugging statements to both sides to check which events are getting fired at which times. That helped me a lot in the past.
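As a sketch, a timestamped dump helper along these lines (the helper name is hypothetical; the real statements were ad hoc additions to frame.js and __init__.py) makes output from the two sides comparable:

```javascript
// Prefix dump() output with an Hour:Minute:Second:Millisecond timestamp
// so events on the JS and Python sides can be correlated by time.
function timedDump(label) {
  const d = new Date();
  const pad = (n, w) => String(n).padStart(w, "0");
  const stamp = `${pad(d.getHours(), 2)}:${pad(d.getMinutes(), 2)}:` +
                `${pad(d.getSeconds(), 2)}:${pad(d.getMilliseconds(), 3)}`;
  return ` ${stamp} ${label}\n`;
}
```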
I've added timestamps to my print/dump statements, formatted as Hour:Minute::Second:Millisecond.
Of the 2 tests, randomly, none, one, or both would be logged.

Flow:

frame.js => fires the endTest event
frame.js => takes the event and fires it through Events.jsm
mozmill/__init__.py :: endTest_listener => receives the event and appends test results

When our problem occurs, the Python side never triggers the endTest_listener method,
which might indicate a problem in Events.jsm

The times look a bit off, but I suspect JS and Python dates are not completely in sync?

Here is a dump where no test was logged:
> TEST-START | expect_assert.js | testExpect
> 
>  11:16::27:410 FIRE endTest EVENT (frame.js)
> 
>  11:16::27:410 mozmill.endTest (frame.js => Events.jsm)
>  11:16::27:428 FIRE endTest EVENT (frame.js)
> 
> 
> 11:16::27:817000 final reporting
> 
> RESULTS | Passed: 0
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
>  11:16::27:428 mozmill.endTest (frame.js => Events.jsm)


Here is a dump where one test was logged:
> TEST-START | expect_assert.js | testExpect
> 
>  11:18::26:882 FIRE endTest EVENT (frame.js)
> 
>  11:18::26:882 mozmill.endTest (frame.js => Events.jsm)
>  11:18::26:916 FIRE endTest EVENT (frame.js)
> 
> TEST-PASS | expect_assert.js | testAssert
> 
>  11:18::26:910000 APPEND TEST RESULTS (mozmill/__init__.py => endTest_listener)
> 
> 
> TEST-END | expect_assert.js | finished in 64ms
> 
> 11:18::27:331000 final reporting
> 
> RESULTS | Passed: 1
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
>  11:18::26:918 mozmill.endTest (frame.js => Events.jsm)

I am not sure how we should proceed further.
If the Python side doesn't get the endTest event, you will have to add debugging statements to jsbridge code. That will show you in detail what's getting transferred or where we bail out with a failure.
I've dug deeper and reached the point where we open a socket, create a buffer with the message, and send it:
https://github.com/mozilla/mozmill/blob/master/jsbridge/jsbridge/extension/resource/modules/Sockets.jsm#L64

In both cases (when we fail and when we succeed) all parameters are equal (except for the timestamp in the message and - probably - the equivalent in the buffer).

Attaching output with dump statements for when we see both tests correctly logged, and when we see only 1.

I'm not sure how to monitor what gets sent through the socket.

**Note: I have noticed that the more stuff I dumped to the console, the harder it was to get a failure. With the dumps seen in the attachments I had no runs with 0 results and one out of 20 runs with 1 result.
Without any dumps the ratios would have been roughly 1 in 6 and 1 in 3.

I could infer that the extra output adds some delay that keeps the tests from failing to report. But nothing conclusive.
Henrik, any idea on how we could proceed?
Based on what I've found the problem seems to lie within the Socket communication.

I'm thinking of setting up some sort of proxy to see all communication, but I'm not sure that it would be feasible - or even possible. I have not tried to eavesdrop on a socket until now.
Flags: needinfo?(hskupin)
Would you mind reducing the assertion test? It might be that a specific check in this test and its result cause this awkward behavior. If there is such a check, it would be easier for us to track down why it's failing.
Flags: needinfo?(hskupin)
(In reply to Andrei Eftimie from comment #21)
> Henrik, any idea on how we could proceed?
> Based on what I've found the problem seems to lie within the Socket
> communication.
> 
> I'm thinking of setting up some sort of proxy to see all communication, but
> I'm not sure that it would be feasible - or even possible. I have not tried
> to eavesdrop on a socket until now.
If you put dump statements in jsbridge's recv/send code you can see all the information going over the socket. jsbridge/extension/resource/modules/Sockets.jsm
Andrei, have you seen my comment 22? Can you please re-check by reducing the assertion test?
Flags: needinfo?(andrei.eftimie)
I am unable to reproduce this anymore. Same machine, updated FF and Mozmill.
Older FF versions still have this problem.

With tinderbox builds I have found the following pushlog which fixes the problem: http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=eeb89381c7a3

This is a merge from birch.
Not sure what could have fixed this. We have some changes to a socket, but they are advertised as Unix only (and we only had this problem on Windows): http://hg.mozilla.org/mozilla-central/rev/6c52203a47b8

Since our problem appeared to be in a Socket connection, might this have fixed it?

Henrik, can you please see if you can reproduce this with the latest Nightly?
Flags: needinfo?(andrei.eftimie) → needinfo?(hskupin)
(In reply to Andrei Eftimie from comment #25)
> With tinderbox builds I have found the following pushlog which fixes the
> problem:
> http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=eeb89381c7a3
>
> This is a merge from birch.

That doesn't make any sense given that it was a merge from mozilla-central to birch and not the other way around. Nothing in here should have changed the behavior for Nightly or a tinderbox build for Nightly.

I tried to run the test in my Win7 VM but it was not successful given the failure:

    "message": "redeclaration of var Cu",
    "fileName": "resource://mozmill/stdlib/securable-module.js -> resource://mozmill/modules/assertions.js",
    "name": "TypeError",
    "lineNumber": 7

Do you see the same? It's only happening for me on Windows with latest code of Mozmill but not on Linux.
Flags: needinfo?(hskupin)
I have no failures on Win 7.
Double checked that I am running latest versions of mozmill-env, mozmill, Nightly.

> $ mozmill -b /c/Mozilla/Nightly/firefox-24.0a1.en-US.win32/firefox/firefox.exe -m /c/work/mozmill/mutt/mutt/tests/js/assertions/tests.ini
> TEST-START | expect_assert.js | testExpect
> TEST-PASS | expect_assert.js | testExpect
> TEST-START | expect_assert.js | testAssert
> TEST-PASS | expect_assert.js | testAssert
> TEST-END | expect_assert.js | finished in 154ms
> RESULTS | Passed: 2
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
> 
> svuser@PXXXWIN7VM ~
> $ mutt -b /c/Mozilla/Nightly/firefox-24.0a1.en-US.win32/firefox/firefox.exe -m /c/work/mozmill/mutt/mutt/tests/js/assertions/tests.ini
> Running python tests
> 
> ----------------------------------------------------------------------
> Ran 0 tests in 0.000s
> 
> OK
> Running JS Tests
> TEST-START | expect_assert.js | testExpect
> TEST-PASS | expect_assert.js | testExpect
> TEST-START | expect_assert.js | testAssert
> TEST-PASS | expect_assert.js | testAssert
> TEST-END | expect_assert.js | finished in 64ms
> ===========================================================================
> Python Failures:
> 
> ===========================================================================
> Javascript Failures:
> 
> Total passed: 2
> Total failed: 0
> Total skipped: 0

================

I got that pushlog by opening "about:buildconfig" in the first tinderbox build that didn't reproduce the problem and clicking the "built from" link.
If we can't repro anymore, then mark as WFM?
Hopefully yes.
I want someone who has previously seen this to confirm that it isn't happening anymore.
I've come across something similar, which might in fact be the same issue, but with a 100% reproducibility rate.

The latest patch on bug 840487 (we should probably wait for that to land to make debugging this easier) passes nicely under mozmill 2.0 with an en-US build.
However, with a jp or fr build I _never_ get a PASS statement and the final output looks like:
> RESULTS | Passed: 0
> RESULTS | Failed: 0
> RESULTS | Skipped: 0

de builds work fine.
Preliminary info might suggest encoding problems.

Previous research showed the problem lying somewhere in the socket communication layer. 
So UTF or encoding problems related to special characters _might_ be a cause.

Will investigate more deeply.
Same happens on OS X when running:

mozmill -b /mozilla/bin/nightly/firefox -t mutt/mutt/tests/js/frame/user_shutdown/multiple/test5.js

So this is most likely a shutdown problem we might be able to fix with the landing of bug 865690.
Depends on: 865690
Oh, I meant Linux not OS X.
OS: Windows XP → All
Hardware: x86 → All
That sounds very reasonable.

We send information through the Socket, but it looks like we never receive it.
Our connection shutting down before all messages come through looks like a very plausible explanation.
I also noticed variances from run to run in both expect_assert.js and screenshot.js in the course of evaluating whimboo's patch for bug 865690. Simply running clean master multiple times showed similar variances from run to run. Some runs would contain a TEST-END or TEST-START, some not.

Configuration:

Windows XP32 SP3
Latest Nightly 25.0a1 20130625031238

I can provide logs if required, but won't initially so I don't clutter up the attachments list.
Summary: Mozmill returns PASS/SKIP/FAIL == 0 for test_assertions.js → Mozmill returns PASS/SKIP/FAIL == 0 for test_assertions.js, screenshot.js
And for context, whimboo has already received those logs via email.
Summary: Mozmill returns PASS/SKIP/FAIL == 0 for test_assertions.js, screenshot.js → Mozmill returns PASS/SKIP/FAIL == 0 for expect_assert.js, screenshot.js
Disregard the failure mentioned in comment 30.
Created bug 889817 for it since it's an entirely different issue.
Most of this bug is no longer reproducible.

Let's wait for bug 865690 to land.
We'll probably be able to close this one as well then.
I retested with the latest mozmill 2.0 application-shutdown changes landed from bug 865690, out of interest. 

I did about a dozen `mutt testjs` runs.

With respect to this open bug, as would be expected, I am still seeing the same problem and run-to-run variances with expect_assert.js and screenshot.js, as described in Comment 34.

Config:
Windows XP32 SP3
mozmill-env w/mozmill 2.0rc4
latest Nightly 25.0a1 20130704031323
Thanks Jonathan for the update!

I'll try to replicate your results on a Win XP box.
I haven't been able to see the problem in Windows7 anymore.
I have not been able to reproduce this on WinXP (50 clean runs).
But I see it on Win7 now.

Will try to dig it up again.
Reproducing this with --debug shows that no Events are triggered after shutdown.
Doesn't seem related to bug 865690

The only visible thing from running with --debug is that the TEST-START event is logged later than when it passes.
(In reply to Andrei Eftimie from comment #40)
> Doesn't seem related to bug 865690

Indeed/agreed.
Ok, I will take this bug so we can get this finished off. Otilia, please find a replacement for Andrei.

I took a look at those tests and I'm not exactly sure yet what's causing this problem. It might be the size or the content of the data we are trying to send via JSBridge. I have to dig deeper into it. At least I have seen a strange functionName for one of the checks in expect_assert.js:

{"fileName":"file:///c:/mozilla/mozmill/mutt/mutt/tests/js/assertions/expect_assert.js","functionName":"testAssert/<","lineNumber":73,"message":" (Error). Doesn't throw on named exception - Got unwanted exception"}

Please see the 'testAssert/<' part. It might not be related, but I will check this first and fix it. I will keep you informed about updates.
Assignee: andrei.eftimie → hskupin
Flags: needinfo?(otilia.anica)
Ok, after digging around a lot without success, I finally picked up on the content size and mentioned that to Clint. He replied that we might have an overflow while sending the data. That immediately sparked some ideas, and I tried a short sleep in wrapper()@frame.js. This let the tests pass! Woohoo! So yes, we send data too quickly.

So I looked into the NSS and Sockets modules of JSBridge. I tried to get a result for the PR_Send call, but no exception gets raised. So I checked the return value of the method and noticed that it should be the length of the data sent. In the failure case I was able to see a lot of -1 values. So I checked MXR for PR_Send usage and found out that we have to retry sending if -1 gets returned.

The patch I have implemented does exactly that, and I can't see the failure anymore. All affected tests are passing now.
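To illustrate the failure mode (everything here is a hypothetical mock, not the real Sockets.jsm code): a sender that ignores PR_Send's -1 return silently drops a message whenever the send buffer is full, which is exactly how the endTest events went missing.

```javascript
// Mock of a non-blocking socket: send() returns -1 (nothing written)
// while the OS send buffer is "full", mirroring PR_Send's behavior.
function makeBusySocket(busyCalls) {
  let busy = busyCalls;
  const delivered = [];
  return {
    delivered,
    send(msg) {
      if (busy > 0) { busy--; return -1; } // buffer full, nothing sent
      delivered.push(msg);
      return msg.length;
    },
  };
}

// Buggy sender: fires each message once and never checks the return
// value, so anything hitting a full buffer is simply lost.
function fireAndForget(socket, messages) {
  for (const msg of messages) {
    socket.send(msg);
  }
}
```

With one "busy" call, the first event vanishes while later ones arrive, matching the dumps above where frame.js fired mozmill.endTest but the Python listener never ran.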
Summary: Mozmill returns PASS/SKIP/FAIL == 0 for expect_assert.js, screenshot.js → Sending data too quickly through the JSBridge socket causes an overflow and lots of event data can get lost (expect_assert.js, screenshot.js)
Attached patch WIP v1 (obsolete) — Splinter Review
This patch fixes the issue for me and does not introduce new problems on any platform.

Andrei and Jonathan, can you both please test this on your machines, given that you were also able to see it? Thanks.
Attachment #775049 - Flags: review?(ctalbert)
Attachment #775049 - Flags: feedback?(tojonmz)
Attachment #775049 - Flags: feedback?(andrei.eftimie)
Attachment #775049 - Attachment description: Patch v1 → WIP v1
I don't seem to see the original problem anymore, at least not to the full extent.

I'm running:
> mozmill -m mutt/mutt/tests/js/assertions/tests.ini -b /c/Mozilla/Nightly/firefox-25.0a1.en-US.win32/firefox/firefox.exe

And with the WIP patch I get 5 out of 6 runs:
> TEST-START | expect_assert.js | testExpect
> TEST-PASS | expect_assert.js | testAssert
> TEST-END | expect_assert.js | finished in 48ms
> RESULTS | Passed: 1
> RESULTS | Failed: 0
> RESULTS | Skipped: 0

It should instead print (I'm getting this correct result with the new patch in 1 out of 6 runs):
> TEST-START | expect_assert.js | testExpect
> TEST-PASS | expect_assert.js | testExpect
> TEST-START | expect_assert.js | testAssert
> TEST-PASS | expect_assert.js | testAssert
> RESULTS | Passed: 2
> RESULTS | Failed: 0
> RESULTS | Skipped: 0

We still have some issues to solve.
Some runs with debug statements:
http://pastebin.com/h7DEwJzW // 1 passed
http://pastebin.com/A79ccKda // 0 passed
Previous links were with --console-level=DEBUG

This one is with --debug: http://pastebin.com/nPErkkVW
Sorry, those should have been run with --debug so we can see which messages get sent over the bridge. As said on IRC, I have never seen it failing anymore. Are you sure the patch has been applied correctly?
Are you sure you are on the latest master? Usually you should get more information with --debug and not only "* Fire event: 'mozmill.pass'". What I see is:

* Sending message: '{"eventType":"mozmill.pass","result":{"pass":{"fileName":"file:///c:/mozilla/mozmill/mutt/mutt/tests/js/assertions/expect_assert.js","functionName":"testAssert","lineNumber":118,"message":" (AssertionError). assert.deepEqual for [[object Object], [object Object], Objects are equal]"}}}'

Please closely check that. I assume that's the reason why it does not work for you.
I am on the latest master + this WIP patch applied.
Edit: the problem was with my mozmill-env; after re-initializing it completely it works now. Something was off there.

I don't get anymore failures!!
Your fix appears to be working nicely.

We should probably log this message only in debug
> Failed sending message through the socket. Retrying.
(In reply to Henrik Skupin (:whimboo) from comment #44)
> Created attachment 775049 [details] [diff] [review]
> WIP v1
> Andrei and Jonathan, can you both please test this on your machines, given
> that you were also able to see it? Thanks.

Just back from several days off, trying it now also. Will report back.
(In reply to Andrei Eftimie from comment #50)
> I am on the lastest master + this WIP patch applied.
> Edit: the problem was with my mozmill-env, re-initialized it completely and
> it works now, seems somthing was off there.

Great. So my assumption was right.

> We should probably log this message only in debug
> > Failed sending message through the socket. Retrying.

Correct. That's why it's a WIP patch. The final patch will also include a Python test.
Comment on attachment 775049 [details] [diff] [review]
WIP v1

Review of attachment 775049 [details] [diff] [review]:
-----------------------------------------------------------------

I missed the feedback flag; as stated above, this works fine for me :)
Great find and fix Henrik!

::: jsbridge/jsbridge/extension/resource/modules/Sockets.jsm
@@ +69,5 @@
> +      buffer = new NSS.Sockets.buffer(message.substring(sent));
> +      count = NSS.Sockets.PR_Send(this.fd, buffer, buffer.length,
> +                                  0, NSS.Sockets.PR_INTERVAL_MAX);
> +      if (count < 0) {
> +        dump('Failed sending message through the socket. Retrying.\n');

We should only log this when debug is active. (or don't log it at all)
Attachment #775049 - Flags: feedback?(andrei.eftimie) → feedback+
Comment on attachment 775049 [details] [diff] [review]
WIP v1

I did a dozen runs with latest master to reproduce the problem variances with expect_assert.js and screenshot.js. Then I did 20 runs with this WIP patch applied; the missing result lines are now correctly present in all 20 patched runs:

TEST-PASS | js\assertions\expect_assert.js | testAssert
TEST-END | js\assertions\expect_assert.js

TEST-PASS | js\controller\screenshot.js | test
TEST-PASS | js\controller\screenshot.js | testChrome

I diffed all the 20 output files against each other, and grep'd them with wc to be absolutely sure.

Config:
Windows XP32 SP3
mozmill-env w/mozmill 2.0rc4 latest
latest Nightly 25.0a1 20130716030202 138524

I too see many of the intended 'Failure sending message through socket' diagnostics in the output. I gather the volume of them is expected: an amount ranging from about 500 to 1,500 in an output file, for each `mutt testjs` run.
Attachment #775049 - Flags: feedback?(tojonmz) → feedback+
Thanks Jonathan! That's great to hear that it is also fixed for you. And as stated above, this output is wanted to show how much information we were leaking before. So by tomorrow I will update the patch and also include a Python test so we can be sure this data loss will never happen again under those circumstances.
Severity: normal → critical
Keywords: dataloss
QA Contact: hskupin
Whiteboard: [ateamtrack: p=mozmill q=2013q2 m=0][mozmill-2.0+] → [mozmill-2.0+]
Flags: needinfo?(otilia.anica)
Comment on attachment 775049 [details] [diff] [review]
WIP v1

I will re-ask for feedback once the patch is ready.
Attachment #775049 - Flags: review?(ctalbert)
Attached patch Patch v1 (obsolete) — Splinter Review
I know that I said a test would be good to have, but I was not able to create a reliable one which always fails with the old implementation; I mostly got passes. So I don't think I should spend more time investigating that more deeply. We should get this patch in and be happy. All of our tests are still working, and others are even fixed with this patch landed.
Attachment #775049 - Attachment is obsolete: true
Attachment #777394 - Flags: review?(ctalbert)
I will do some new runs with the latest patch and report back.
Another dozen runs without the patch, just to confirm the test environment was valid. And 20 runs with the final patch v1 from Comment 57 applied; the missing result lines are still correctly present in all 20 runs.

Pass: 61
Fail: 0
Skip: 2

Config:
Windows XP32 SP3
mozmill-env w/mozmill 2.0rc4 latest
latest Nightly 25.0a1 20130717030207 138720
Fantastic. That's really great to hear!
Comment on attachment 777394 [details] [diff] [review]
Patch v1

Review of attachment 777394 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great. r+ with the one comment addressed.  I tested it on windows 7 64, and couldn't get anything to fail.  So I think we might have finally nailed this one.  Good job Henrik, Andrei and Jonathan!

::: jsbridge/jsbridge/extension/resource/modules/Sockets.jsm
@@ +70,5 @@
> +      buffer = new NSS.Sockets.buffer(message.substring(sent));
> +      count = NSS.Sockets.PR_Send(this.fd, buffer, buffer.length,
> +                                  0, NSS.Sockets.PR_INTERVAL_MAX);
> +      if (count < 0) {
> +        Log.dump('PR_Send', 'Failure sending data. Retrying...');

If we lose the socket then this is going to retry forever. Let's put something in here to prevent that case. The safest is probably an upper limit on retries. Often when the socket goes away our handle to it (this.fd) is still valid, so we can't simply test for that being null. I'd recommend putting something in where we retry some number of times and break out of the loop if we hit max retries. I'd say 100 retries should be plenty (pulling a number out of thin air).
Attachment #777394 - Flags: review?(ctalbert) → review+
So I was trying to find something better and found it. If count is -1, an error is expected, and you can retrieve it via PR_GetError(). So I have implemented this method in addition to the other Socket methods. The error value in those cases is PR_WOULD_BLOCK_ERROR. After some searching I found the following webpage, which describes fantastically what's going on:

http://www.mail-archive.com/dev-tech-crypto@lists.mozilla.org/msg09750.html

So a better solution here is to check for that error and retry sending the data. At some point the send buffer will be empty again and we can send our message. If another error occurs, we break the loop. That way we do not need a hardcoded limit, which I never like. :)

The patch is coming up in a bit with some refactoring of NSS.jsm.
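A minimal sketch of that control flow (the mock PR_Send/PR_GetError pair stands in for the real NSPR calls reached via js-ctypes; function names and the mock's internals are illustrative only):

```javascript
const PR_WOULD_BLOCK_ERROR = -5998; // NSPR's would-block error code

// Mock of the two NSPR calls: PR_Send fails while the send buffer is
// "full" and PR_GetError reports why the last call failed.
function makeMockSockets(busyCalls) {
  let busy = busyCalls;
  let lastError = 0;
  const delivered = [];
  return {
    delivered,
    PR_Send(data) {
      if (busy > 0) {
        busy--;
        lastError = PR_WOULD_BLOCK_ERROR;
        return -1; // nothing was written
      }
      delivered.push(data);
      return data.length;
    },
    PR_GetError() { return lastError; },
  };
}

// Keep sending the unsent remainder (plus the trailing '\0' message
// terminator); retry on PR_WOULD_BLOCK_ERROR, bail out on anything else.
function sendAll(sockets, message) {
  message += "\0";
  let sent = 0;
  while (sent < message.length) {
    const count = sockets.PR_Send(message.substring(sent));
    if (count < 0) {
      if (sockets.PR_GetError() !== PR_WOULD_BLOCK_ERROR) return false;
      continue; // send buffer full: retry with the same data
    }
    sent += count;
  }
  return true;
}
```

No retry counter is needed: the loop ends either when the whole message is out or when a non-would-block error shows the socket is really gone.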
Attachment #777394 - Attachment is obsolete: true
Attachment #778028 - Flags: review?(ctalbert)
The last version of the patch had a double quote due to Sublime Text's autocomplete feature. This is fixed now.
Attachment #778028 - Attachment is obsolete: true
Attachment #778028 - Flags: review?(ctalbert)
Attachment #778034 - Flags: review?(ctalbert)
Comment on attachment 778034 [details] [diff] [review]
0001-Bug-764640-Sending-data-too-quickly-through-the-JSBr.patch

Review of attachment 778034 [details] [diff] [review]:
-----------------------------------------------------------------

I think this looks good. Tested it again on win 7 64, everything came through with flying colors.

::: jsbridge/jsbridge/extension/resource/modules/Sockets.jsm
@@ +65,5 @@
> +    var buffer;
> +    var count;
> +    var sent = 0;
> +
> +    // We have to send a final '\0' so we need one more char

Good addition.

@@ +73,5 @@
> +                                  0, NSS.Sockets.PR_INTERVAL_MAX);
> +      if (count < 0) {
> +        var error = NSS.Sockets.PR_GetError();
> +        if (error !== NSS.Types.PR_WOULD_BLOCK_ERROR) {
> +          Log.dump("PR_Send", "Failed with error " + error);

Much better than my idea. Thanks.
Attachment #778034 - Flags: review?(ctalbert) → review+
Landed on master:
https://github.com/mozilla/mozmill/commit/42aa3cb84ba201b2fc6368cf193302311e4a78cb
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Whiteboard: [mozmill-2.0+] → [mozmill-2.0+][ateamtrack: p=mozmill q=2013q3 m=1]
Product: Testing → Testing Graveyard