Closed Bug 951188 Opened 11 years ago Closed 10 years ago

[RTSP][V1.3] No error notification when the RTSP link fails to load

Categories

(Core :: Networking, defect)

28 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla29
blocking-b2g 1.3+
Tracking Status
firefox27 --- wontfix
firefox28 --- fixed
firefox29 --- fixed
b2g-v1.3 --- verified
b2g-v1.3T --- fixed
b2g-v1.4 --- verified

People

(Reporter: whsu, Assigned: ethan)

References

Details

(Whiteboard: [FT:RIL][qa-])

Attachments

(4 files, 2 obsolete files)

* Description:
  This problem happened on v1.3 build.
  Video app keeps loading when the RTSP streaming is offline

* Reproduction steps:
  1. Connect to a hotspot
  2. Launch the following page via browser
     - http://goo.gl/lE2eE3
  2. Tap the "Fake link"

* Expected result:
  Video app pops up a warning message

* Actual result:
  Video app keeps loading

* Reproduction build: V1.3 (Buri)
 - Gaia:     dca0a3dcf062ce3e422a9c56d141c14543c816fb
 - Gecko:    http://hg.mozilla.org/releases/mozilla-aurora/rev/1f7db4cc788e
 - BuildID   20131217004001
 - Version   28.0a2
Attached video WP_20131218_003.mp4
Attached the video
blocking-b2g: --- → 1.3?
Strange, we still have onerror callback hooked. In this case, we should receive an error, MediaError.MEDIA_ERR_DECODE or MEDIA_ERR_SRC_NOT_SUPPORTED, through onerror callback. If it is, we can display the error message.
Assignee: nobody → vchang
NI Sri for confirming blocking decision (per some recent email from Sri/Chris - video streaming is not a blocker for 1.3)
Flags: needinfo?(skasetti)
Not a blocker for 1.3
blocking-b2g: 1.3? → -
Flags: needinfo?(skasetti)
Nominated to 1.3?, as actually RTSP streaming function was landed and accessible in 1.3. We will triage to see if this defect is must-fix or not.
blocking-b2g: - → 1.3?
wesley, please triage, we are ignoring this one in the media triage (1/15)
Flags: needinfo?(whuang)
Flags: needinfo?(whuang)
Whiteboard: [FT:RIL]
defer enhancement to 1.4
blocking-b2g: 1.3? → 1.4?
I'm reworking the title here to the actual problem, as the dupe & discussion offline indicates that this happens across the board with any type of error when failing to load a RTSP URL in the video app.
Summary: [RTSP][V1.3] No warning message pops up and streaming keeps loading while RTSP streaming is offline → [RTSP][V1.3] No error notification when the RTSP link fails to load
This happens with audio RTSP URL links as well. I feel strongly that this is bad UX - right now, on any failure to load a RTSP URL for any reason, we'll start infinitely spinning forever in the video app without any indication to the user that the RTSP URL failed to load. A user won't have any knowledge of the fact that there is a problem here, so they'll be confused on what's causing the issue - is it a broken RTSP link? Are you offline? What happened?

This violates the principal of gulf of evaluation - http://en.wikipedia.org/wiki/Gulf_of_evaluation.

needinfo on UX to weigh in on the opinion of the severity of this bug
blocking-b2g: 1.4? → 1.3?
Flags: needinfo?(firefoxos-ux-bugzilla)
This is, as Jason points out, abysmal UX. During user testing, we see high user frustration when users do not understand *why* something will not work, such as why they've lost network connectivity, cannot send a text message, or cannot view a web page. Loading a video is considered pretty basic functionality, and users also use video a lot to stream and just listen to music without actually watching the video necessarily (Colombia research).

I am concerned about the impact of this on battery life and processing power: does it drain the battery or slow other things down? How easily can the user get out of this state to do something else in the same place?

UX preference would be to block on any of the above, and definitely on all of them.
Flags: needinfo?(firefoxos-ux-bugzilla)
Assignee: vchang → ettseng
We'd like to have an error message that is thrown to indicate to the user that an error has occurred.

Flagging UX to see how best to proceed.
blocking-b2g: 1.3? → 1.3+
Flags: needinfo?(firefoxos-ux-bugzilla)
Flagging this as late-l10n, 'cause it sounds like all this wants is a user message.
Keywords: late-l10n
There are two root causes of this bug:

1. Lack of application timeout for TCP connect 
Our RTSP client uses the general method of non-blocking TCP connect, i.e., calling select() every 1ms to check whether the TCP connection is established or not. However, it doesn't have an application level timeout mechanism to abort connect(). The actual timeout depends on the maximum TCP connect timeout value.
Since our Linux kernel sets /proc/sys/net/ipv4/tcp_syn_retries = 5, the timeout duration is around 180 seconds.

2. Lack of reporting error code from RTSP connection component
Right now the RTSP connection component does not report error code to its protocol listener.

In order to prompt an error message when an RTSP link fails to load, we need to resolve both these two issues.
Attached patch bug-951188-wip.patch (obsolete) — Splinter Review
What I did in this patch:
1. Set up error code in messages transmitted in RTSPConnectionHandler and ARTSPConnection.
   The purpose is to report an error code to RTSPSource::onDisconnected() in order to distinguish the normal and abnormal disconnected conditions.
2. Add a timeout mechanism for TCP connect. The timeout value is around 10 seconds.
Attachment #8367244 - Flags: review?(sworkman)
I tested this patch with the following cases:
1. Normal case (playing RTSP audio streaming)
2. Failed to establish RTSP session (host is unreachable or port is not listening)
3. RTSP session established, but failed to load the file (such as typo of the filename)

In case 2, an error message will prompt after 10 seconds.
In case 3, RTSP server will return an error and an error message will prompt quickly.

In either of these two cases, the error message on UI is the same: "A network error caused the video failed to download." The attachment is the captured screen.
This refers back to the existing error message, nice. Removing late-l10n.

In case you're wondering about the copy in the screenshot, that's bug 892950, fixed on master.
Keywords: late-l10n
Comment on attachment 8367244 [details] [diff] [review]
bug-951188-wip.patch

Review of attachment 8367244 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good: r=me. A couple of minor changes requested to add mNumSelectTimeoutRetries = 0 in two places. Please make those changes and retest before landing.

::: netwerk/protocol/rtsp/rtsp/ARTSPConnection.cpp
@@ +386,5 @@
> +            // and reply an error to RTSPConnectionHandler.
> +            mNumSelectTimeoutRetries = 0;
> +            reply->setInt32("result", -ETIMEDOUT);
> +            reply->post();
> +        }

This looks like a decent timeout mechanism to me. select() waits for 1ms each time it is called, and there are 10,000 retries, so about 10s for a connection timeout. And the thread should never block for too long on select, which seems good too.

Please add a log statement in the else branch.

@@ +391,4 @@
>          return;
>      }
>  
>      int err;

Just to be safe, please set mNumSelectTimeoutRetries = 0 after we have made it through the retry mechanism.

Also, I think you should (re-)initialize it before the first call to onCompleteConnection ... maybe in onConnect?
Attachment #8367244 - Flags: review?(sworkman) → review+
Component: Gaia::Video → Networking
Product: Firefox OS → Core
Version: unspecified → 28 Branch
Attached patch bug-951188-fix.patch (obsolete) — Splinter Review
Fixed the issues mentioned in the review.
Attachment #8367244 - Attachment is obsolete: true
Attachment #8367748 - Flags: review+
(In reply to Steve Workman [:sworkman] from comment #20)
> Comment on attachment 8367244 [details] [diff] [review]
> Just to be safe, please set mNumSelectTimeoutRetries = 0 after we have made
> it through the retry mechanism.
When onCompleteConnection timed out connect, it posts a reply message back to RTSPConnectionHandler. And I added a mConn->disconnect() call in RTSPConnectionHandler for all the cases except for result = OK. So finally all the error cases will call ARTSPConnection::performDisconnect(). I added mNumSelectTimeoutRetries = 0 in this function.

> Also, I think you should (re-)initialize it before the first call to
> onCompleteConnection ... maybe in onConnect?
Yes! Thanks for your reminder. I added it in onConnect() before the first call to onCompleteConnection.
Clearing UX needs info as it seems like the issue has been resolved with the error messages recommended.
Flags: needinfo?(firefoxos-ux-bugzilla)
Added a minor change to improve robustness. In RTSPSource::onDisconnected(), add a check for mLooper and mHandler.
Attachment #8367748 - Attachment is obsolete: true
Attachment #8367801 - Flags: review+
Keywords: checkin-needed
(In reply to Ethan Tseng [:ethan] from comment #18)
> I tested this patch with the following cases:
> 1. Normal case (playing RTSP audio streaming)
> 2. Failed to establish RTSP session (host is unreachable or port is not
> listening)
> 3. RTSP session established, but failed to load the file (such as typo of
> the filename)
> In case 2, an error message will prompt after 10 seconds.
> In case 3, RTSP server will return an error and an error message will prompt
> quickly.

Add one more case:
Case 4: Turn off Wifi while RTSP streaming is being played.
The network error message will prompt immediately.
https://hg.mozilla.org/mozilla-central/rev/11fe654f16d8
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Keywords: verifyme
Verified fixed on the latest master build
The error message notifies user about the broken link

Device: Buri 1.4 MOZ
BuildID: 20140219040204
Gaia: ac06cfbd2baf6494ffbb668cc599e3892cd5e17b
Gecko: bf0e76f2a7d4
Version: 30.0a1
Firmware Version: v1.2-device.cfg
Thanks for the help.
Good Job! Video app pops up a warning message

* Build Information:
 - Gaia      a43904d9646109b48836d62f7aa51e499fbf4b2e
 - Gecko     https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/32fd5d798477
 - BuildID   20140219164003
 - Version   28.0

* Test Result:
 - Cannot reproduce

Also, attaching the screenshot (2014-02-20-21-13-38.png). FYI.
Status: RESOLVED → VERIFIED
Attached image 2014-02-20-21-13-38.png
Jason, does this need any testing on Desktop?
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #33)
> Jason, does this need any testing on Desktop?

Nope - this only applies to FxOS.
(In reply to Jason Smith [:jsmith] from comment #34)
> Nope - this only applies to FxOS.

Thanks Jason. I'm marking this [qa-] for Desktop Firefox since it's already been marked verified for FxOS.
Keywords: verifyme
Whiteboard: [FT:RIL] → [FT:RIL][qa-]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: