Closed Bug 1113154 Opened 5 years ago Closed 5 years ago

Plivo sometimes raises a 404 error just after the call has been created

Categories

(Firefox OS Graveyard :: Gaia::UI Tests, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RobertC, Assigned: jlorenzo)

Details

Attachments

(1 file)

46 bytes, text/x-github-pull-request
RobertC
: review+
Bebe
: review+
http://jenkins1.qa.scl3.mozilla.com/view/UI/job/flame-kk-319.mozilla-central.ui.functional.non-smoke.2/51/HTML_Report/

There have been a few test failures lately where the Plivo make-call method returned error code 404.
Even though that error was raised, the call was still made, as the tearDown() failure in the link above shows.

Need to investigate why Plivo raised error 404 when the call was successful.

Another issue is that, because the call was not killed in tearDown(), it kept going through the device restart. In the link you can see that the second failing test receives an incoming call from the test before it.

Johan, does the hangup_all_calls() method from the Plivo API clear the calls from the entire account (auth_id/auth_token), thus affecting other tests that run Plivo at the same time?
Flags: needinfo?(jlorenzo)
QA Whiteboard: [fxosqa-auto-s6+]
hangup_all_calls() does a DELETE on /Call/ (see [1]). This REST request is not documented[2]. Judging by the function name, and assuming they followed RESTful conventions, I would say it hangs up every call of a given user; but it really depends on how they implemented it server-side.

We can check this behavior when there is not that much activity.

[1] https://github.com/plivo/plivo-python/blob/master/plivo.py#L237
[2] https://www.plivo.com/docs/api/call/
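
For illustration, the request described above could look roughly like this. It is only a sketch: the base URL `https://api.plivo.com/v1/Account/<auth_id>/` and the helper name `build_hangup_all_request` are assumptions, not copied from plivo.py, and the request is built but never sent.

```python
import base64
import urllib.request


def build_hangup_all_request(auth_id, auth_token):
    """Build (but do not send) a DELETE on the account-wide /Call/ collection."""
    # Assumed base URL; the real one is defined in the Plivo helper library.
    url = "https://api.plivo.com/v1/Account/%s/Call/" % auth_id
    credentials = ("%s:%s" % (auth_id, auth_token)).encode("ascii")
    request = urllib.request.Request(url, method="DELETE")
    request.add_header(
        "Authorization",
        "Basic " + base64.b64encode(credentials).decode("ascii"),
    )
    return request
```

Nothing in such a URL would identify a single call or device, only the account, so the scope of the hangup is entirely up to the server.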
Flags: needinfo?(jlorenzo)
QA Whiteboard: [fxosqa-auto-s6+] → [fxosqa-auto-s7]
I checked hangup_all_calls() locally and it stops all calls, even those from tests running on other devices. I will try to find another solution for the issue in this bug.
It looks like Plivo retries the call if the first attempt failed.

This might affect the next test that runs.

I managed to reproduce this issue once. jlorenzo, do you know anything about this behaviour?
Flags: needinfo?(jlorenzo)
This behavior is not in the official documentation. I have found something, though. When you use the call REST API, you have to provide a list of gateways through which your calls will be placed[1]. It tries each gateway in turn until the call is actually made[2].

I checked which POST params we send from our tests by placing a print just before this line[3]; no gateways show up there, so they are likely appended on the server side. Even though we don't know anything about Plivo's server infrastructure, I think it's safe to assume they have at least 2 gateways. We can conclude that the behavior you saw, Bebe, is the expected one.

[1] https://github.com/plivo/plivoframework/blob/master/src/plivo/rest/freeswitch/api.py#L493 and https://github.com/plivo/plivoframework/blob/master/src/plivo/rest/freeswitch/api.py#L548
[2] https://github.com/plivo/plivoframework/blob/master/src/plivo/rest/freeswitch/inboundsocket.py#L626
[3] https://github.com/plivo/plivo-python/blob/master/plivo.py#L58
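
The gateway fallback described above can be sketched as follows. This is not Plivo's code: `place_call`, `dial` and the gateway names are hypothetical, and the sketch only shows the "try each gateway until one succeeds" idea.

```python
def place_call(gateways, dial):
    """Try each gateway in order; return the first one that accepts the call.

    `dial` is a hypothetical callable returning True when the call goes through.
    """
    for gateway in gateways:
        if dial(gateway):
            return gateway
    # Every gateway refused the call.
    return None


attempts = []


def dial(gateway):
    # Record each attempt; only the second (made-up) gateway succeeds.
    attempts.append(gateway)
    return gateway == "gw2"


used = place_call(["gw1", "gw2", "gw3"], dial)
```

With at least 2 gateways, a failed first attempt is followed by a second one, which from the device's point of view looks like a "recall".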
Flags: needinfo?(jlorenzo)
I'll look into this tomorrow.
Assignee: robert.chira → jlorenzo
We can fix the 404 problem here.

Sometimes, when we create a call with the Plivo REST API[1], Plivo doesn't immediately create the /Call/:uuid endpoint. The wait[2] we have just after is useless because we throw an exception in get_all_call[3].

The fix is simple: we just have to mute the PlivoError exception. I'll provide it.

Once that is done, we should see fewer "recalls", which spares us from waiting for a second call in the tearDown part.
 
[1] https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/utils/plivo/plivo_util.py#L38
[2] https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/utils/plivo/plivo_util.py#L48
[3] https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/utils/plivo/plivo_util.py#L65
Summary: [v2.2] Investigate Plivo account failures → Plivo sometimes raises a 404 error just after the call has been created
Attached file Gaia PR
After some thought, it's cleaner to raise PlivoActiveCallNotFound if Plivo returns a 404. This exception is already muted in the Wait.
Attachment #8544598 - Flags: review?(robert.chira)
Attachment #8544598 - Flags: review?(florin.strugariu)
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #7) 
> This exception is already muted in the Wait.

By "muted", I mean ignored[1]. So the wait will hold properly until we find an actual call, and won't propagate the exception immediately, as explained in the doc[2].

[1] https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/utils/plivo/plivo_util.py#L48
[2] http://marionette-client.readthedocs.org/en/latest/reference.html?highlight=wait#marionette.Wait
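
A minimal sketch of the idea, not the actual patch: `wait_until` is a simplified stand-in for Marionette's Wait with ignored exceptions, `PlivoError` here carries a `status` attribute that the real class may not have, and `get_call` and `fetch` are hypothetical names.

```python
import time


class PlivoError(Exception):
    """Stand-in for the helper library's error; `status` is an assumption."""

    def __init__(self, status):
        super().__init__("Plivo returned HTTP %d" % status)
        self.status = status


class PlivoActiveCallNotFound(Exception):
    """Raised when the /Call/:uuid endpoint does not exist (yet)."""


def get_call(fetch, call_uuid):
    """Fetch a call, translating a 404 into PlivoActiveCallNotFound."""
    try:
        return fetch(call_uuid)
    except PlivoError as error:
        if error.status == 404:
            raise PlivoActiveCallNotFound(call_uuid)
        raise  # any other error still propagates immediately


def wait_until(condition, timeout=5.0, interval=0.1, ignored_exceptions=()):
    """Poll condition() until it is truthy, swallowing ignored exceptions."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored_exceptions:
            pass  # endpoint not ready yet; keep polling
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(interval)
```

Under these assumptions, a fetch that answers 404 twice and then succeeds makes the wait return the call instead of failing on the first 404, while a 500 would still surface right away.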
3 errors out of 101 tries in the adhoc job. These 3 are not related to the 404 issue initially reported here. I created a follow-up bug to track the issue (bug 1118331).

It seems to me we have gained stability in this Plivo test.
Attachment #8544598 - Flags: review?(robert.chira) → review+
Attachment #8544598 - Flags: review?(florin.strugariu) → review+