Closed Bug 1050461 Opened 5 years ago Closed 5 years ago

H.264 RTP packetization with multiple NALUs per frame

Categories

(Core :: WebRTC: Audio/Video, defect)

33 Branch
x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox32 --- wontfix
firefox33 --- fixed
firefox34 --- fixed

People

(Reporter: mzanaty, Assigned: mzanaty)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

Attached patch h264rtp.diff (obsolete) — Splinter Review
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.77.4 (KHTML, like Gecko) Version/7.0.5 Safari/537.77.4

Steps to reproduce:

Firefox (33+) WebRTC video call fails to decode H.264 video from non-Firefox peers when they send multiple H.264 NAL units per frame (with the same RTP timestamp). Packet loss, reordering, or duplication scenarios can also sometimes cause the same issue between two Firefox peers.


Actual results:

No incoming H.264 video rendered in Firefox. Outgoing H.264 video from Firefox renders fine on non-Firefox WebRTC peers.


Expected results:

Incoming H.264 video should render in Firefox from non-Firefox WebRTC peers.

Attached patch changes the H.264 RTP receiver to handle multiple NAL units per frame with the same timestamp. A single jitter buffer frame is created for all packets with the same RTP timestamp. NAL units are depacketized upon insertion to the encoded frame buffer. Depacketization includes insertion of start codes and removal of fragmentation and aggregation unit headers.
Attachment #8469502 - Attachment is patch: true
Attachment #8469502 - Attachment mime type: text/x-patch → text/plain
Attachment #8469502 - Flags: review?(rjesup)
Thanks for the patch, Mo!
Assignee: nobody → mzanaty
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 8469502 [details] [diff] [review]
h264rtp.diff

Review of attachment 8469502 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with nits; please update and re-upload.  Make sure your ~/.hgrc knows your user name

::: media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.h
@@ +70,5 @@
> +  };
> +  enum StartCodePrefix { // H.264 Annex B format {0,0,0,1}
> +    kStartCodeSize = 4 // 4 byte prefix before each NAL header
> +  };
> +  

trailing space

::: media/webrtc/trunk/webrtc/modules/video_coding/main/source/session_info.cc
@@ +118,5 @@
>  
> +void VCMSessionInfo::InsertStartCode(bool insert_start_code, uint8_t* dst,
> +                                     const uint8_t* src, size_t len) {
> +  if (insert_start_code) {
> +    dst[0]=0; dst[1]=0; dst[2]=0; dst[3]=1;

Either clear kStartCodeSize-1 bytes then insert a 1, or assert that kStartCodeSize is 4

@@ +120,5 @@
> +                                     const uint8_t* src, size_t len) {
> +  if (insert_start_code) {
> +    dst[0]=0; dst[1]=0; dst[2]=0; dst[3]=1;
> +  }
> +  memcpy(dst+(insert_start_code ? RtpFormatH264::kStartCodeSize : 0),src,len);

spaces after commas

@@ +122,5 @@
> +    dst[0]=0; dst[1]=0; dst[2]=0; dst[3]=1;
> +  }
> +  memcpy(dst+(insert_start_code ? RtpFormatH264::kStartCodeSize : 0),src,len);
> +}
> +  

delete-trailing-spaces (emacs) or equivalent

@@ +136,5 @@
> +    // Calculate extra packet size needed for adding start codes,
> +    // and removing fragmentation and aggregation unit headers.
> +    size_t nalu_size, all_nalu_size = 0;
> +    const uint8_t* nalu_ptr = packet.dataPtr;
> +    uint8_t nal_header = *nalu_ptr, fu_header;

separate lines

@@ +152,5 @@
> +          frame_buffer[RtpFormatH264::kStartCodeSize] = nal_header;
> +          packet.sizeBytes += RtpFormatH264::kStartCodeSize;
> +          packet.dataPtr = frame_buffer;
> +          packet.completeNALU = kNaluStart;
> +        } else {

There's duplicated code between the start and non-start, but I don't think it's worth the internal complexity to merge them.

@@ +203,5 @@
> +        break;
> +    } // switch nal_type
> +  } else { // not H.264
> +    ShiftSubsequentPackets(packet_it, packet.sizeBytes);
> +    InsertStartCode(false, frame_buffer, packet.dataPtr, packet.sizeBytes);

Does non-H.264 really want H.264 start codes?  No, given passing 'false' here, which means it's really "copy buffer" and poorly named.  Let's rename it, it's really "copy data with optional start-code insertion".

@@ +502,5 @@
>    }
> +  
> +  // Track the marker bit, should only be set for one packet per session.
> +  if (packet.markerBit && last_packet_seq_num_ == -1) {
> +    last_packet_seq_num_ = static_cast<int>(packet.seqNum);

This is probably ok, though I cringe a little at the overall handling of the Marker bit in the upstream code.
Attachment #8469502 - Flags: review?(rjesup) → review+
(In reply to Randell Jesup [:jesup] from comment #2)
> trailing space
> delete-trailing-spaces (emacs) or equivalent
> spaces after commas

Done

> > +    dst[0]=0; dst[1]=0; dst[2]=0; dst[3]=1;
> Either clear kStartCodeSize-1 bytes then insert a 1...

Done

> > +    size_t nalu_size, all_nalu_size = 0;
> > +    const uint8_t* nalu_ptr = packet.dataPtr;
> > +    uint8_t nal_header = *nalu_ptr, fu_header;
> separate lines

Done

> > +          packet.completeNALU = kNaluStart;
> There's duplicated code between the start and non-start, but I don't think
> it's worth the internal complexity to merge them.

There are just enough differences that merging into common but parameterized functions adds complexity and overhead. I started there, it was uglier, so I "unrolled" it.

> > +  } else { // not H.264
> > +    ShiftSubsequentPackets(packet_it, packet.sizeBytes);
> > +    InsertStartCode(false, frame_buffer, packet.dataPtr, packet.sizeBytes);
> Does non-H.264 really want H.264 start codes?  No, given passing 'false'
> here, which means it's really "copy buffer" and poorly named.  Let's rename
> it, it's really "copy data with optional start-code insertion".

Agreed. Done.

> > +  // Track the marker bit, should only be set for one packet per session.
> > +  if (packet.markerBit && last_packet_seq_num_ == -1) {
> > +    last_packet_seq_num_ = static_cast<int>(packet.seqNum);
> This is probably ok, though I cringe a little at the overall handling of the
> Marker bit in the upstream code.

Agreed. It may be worth a future fix to clean up marker and timestamp handling in general. Like only rely on the marker bit for early end-of-frame detection once it proved reliable (next seq# has next timestamp). If unreliable, fallback to waiting for the next timestamp before declaring end-of-frame. I had a quick look, and this appears non-trivial in upstream code due to the rampant creation/destruction of packets, frames, and frame lists. Marker state would have to go into longer-lived objects. Good future to-do though.
New patch per review from Randell Jesup.
Attachment #8469502 - Attachment is obsolete: true
Comment on attachment 8470410 [details] [diff] [review]
h264rtpdiffv2.txt

Review of attachment 8470410 [details] [diff] [review]:
-----------------------------------------------------------------

Normally the sequence would be to resolve the nits, upload a ready-for-checkin patch with the first line of description in a format similar to what is seen at https://tbpl.mozilla.org/?tree=Mozilla-Inbound ("Bug XXXXX: description r=xxxxx" - you can add additional lines of descriptive text, though most people don't.  Then mark it r+ with a comment "carrying forward r=xxxx", and add Keyword "checkin-needed" to the bug, and within a day or so a sheriff will check it in.  In this case, I'll just resolve the nits and land it.

Thanks!

::: media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.h
@@ +38,5 @@
> +  // This supports H.264 RTP packetization modes 0/1 from RFC 6184
> +  // Single NALU: NAL Header (1 byte), Data...
> +  // FU-A   NALU: NAL Header, FU Header (1 byte), Data...
> +  // STAP-A NALU: NAL Header, Length1 (2 bytes), Data1, Length2, Data2...
> +  

spaces

::: media/webrtc/trunk/webrtc/modules/video_coding/main/source/session_info.cc
@@ +120,5 @@
> +}
> +
> +void VCMSessionInfo::CopyWithStartCode(uint8_t* dst, const uint8_t* src, size_t len) {
> +  // H.264 Start Code is 2 or more bytes of 0 followed by 1 byte of 1.
> +  memset(dst, 0, RtpFormatH264::kStartCodeSize);

RtpFormatH264::kStartCodeSize -1

@@ +485,4 @@
>      return -2;
>  
>    PacketIterator packet_list_it;
> +  

trailing spaces in a few places
Attachment #8470410 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/a246e66e8e53
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [openh264-uplift]
Comment on attachment 8470410 [details] [diff] [review]
h264rtpdiffv2.txt

Note: should be paired with bug 1051566, which enables use of Mode 0.

Approval Request Comment
[Feature/regressing bug #]:N/A

[User impact if declined]: No support for H.264 mode 0, which is required by IETF 6184.  Most equipment doesn't use mode 0, but some equipment people care about does.  Also, this fixes support for non-picture-coding H.264 packets (SEI and the like), which is perhaps even more important since you can't negotiate them away.

[Describe test coverage new/current, TBPL]: On m-c since the 10th.  Considerable manual testing by Cisco, and manual testing.  I added a checkbox to force Mode 0 to our primary test page (http://mozilla.github.com/webrtc-landing/pc_test.html) We can't test H.264 OMX using a real codec in TBPL.  Note: most of the code changes are tested by both Mode 0 and Mode 1.

[Risks and why]: moderate risk - larger number of code changes.  Almost all risk contained to H.264 (both mode 0 and 1), and would generally manifest itself as inability to show decode (to show video).

[String/UUID change made/needed]: none
Attachment #8470410 - Flags: approval-mozilla-aurora?
Attachment #8470410 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
This hits conflicts on Aurora due to bug 1045468 not being present. Please rebase.
Flags: needinfo?(mzanaty)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #9)
> This hits conflicts on Aurora due to bug 1045468 not being present. Please
> rebase.

We actually want that one as well, and it's trivial/no-risk.  aurora? requested.
(In reply to Randell Jesup [:jesup] from comment #10)
> (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #9)
> > This hits conflicts on Aurora due to bug 1045468 not being present. Please
> > rebase.
> We actually want that one as well, and it's trivial/no-risk.  aurora?
> requested.

Uplifting 1045468 first will help to auto-merge this patch. But this patch replaces the code 1045468 adds, so if hand merging you can ignore 1045468 entirely.
Flags: needinfo?(mzanaty)
Whiteboard: [openh264-uplift]
We are facing a similar issue when trying to play a H.264 rtp stream with encoded H.264 having multiple NALUs/frame. Can you please provide us a pcap of a session with H.264 multiple NALUs/frame?
(In reply to SignalStrength from comment #13)
> We are facing a similar issue when trying to play a H.264 rtp stream with
> encoded H.264 having multiple NALUs/frame. Can you please provide us a pcap
> of a session with H.264 multiple NALUs/frame?

We aren't set up to help people debug external applications.
You need to log in before you can comment on or make changes to this bug.