Closed Bug 1022524 Opened 10 years ago Closed 10 years ago

Set up audio track and video track information for Ogg files

Categories

(Core :: Audio/Video, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: shelly, Assigned: kikuo)

References

Details

(Whiteboard: [FT:Stream3])

Attachments

(2 files, 15 obsolete files)

273.11 KB, patch
kikuo
: review+
Details | Diff | Splinter Review
32.42 KB, patch
kikuo
: review+
Details | Diff | Splinter Review
Follow-up from bug 744896, comment 80. For now we are hard coding the info of AudioTrack and VideoTrack in MediaInfo.h, we might want to parse out the real info for media tracks base on specific codec decoder.
Depends on: 744896
Once we have at least one of our supported codec implements the track information, then we can turn the track interface on by default, I think.
Blocks: 1026351
blocking-b2g: --- → 2.1?
Hello, I'd like to take this issue, and already working on it. One question, 1) to extract information and set to Mediatracks in MediaDecoder, and 2) Make MediaDecoder be able to select Audio/Video Track by Index. Are we gonna do only 1 or 1+2 for this issue? or it's up to me @@ ? BTW, I'd choose Ogg file format as the first modification. Just because it's defined more clearly in [1]. [1] http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#dom-audiotrack-kind
Flags: needinfo?(slin)
I don't think we need to make MediaDecoder able to select video/audio track at this point, MediaDecoder can always chose the "main" track of video/audio. Ogg is a good choice, but be ware of the chaining in Ogg :)
Flags: needinfo?(slin)
Whiteboard: [FT:Stream3]
Attached patch mt_patch1.diff (obsolete) — Splinter Review
Hi Shelly, Just attached the patch that i recently worked on. I wondering that if you could give me some feedback about it. Basically, the main idea is to 1) Extract Theora/Vorbis/Opus bitstream setup procedure as single function, so that it would be easier to support switching dynamically. 2) Parse fisbone header information if exists, and store in mSkeletonState. Then assign these information to target Audio/VideoInfo array 3) Adopt new MediaInfo structure in all other codec readers. And, any feedback is appreciated, thanks :)
Assignee: nobody → kikuo
Status: NEW → ASSIGNED
Attachment #8458766 - Flags: feedback?(slin)
Attached patch mt_patch2.diff (obsolete) — Splinter Review
Same as Comment 4, just fix few indentation diff.
Attachment #8458766 - Attachment is obsolete: true
Attachment #8458766 - Flags: feedback?(slin)
Attachment #8458778 - Flags: feedback?(slin)
Comment on attachment 8458778 [details] [diff] [review] mt_patch2.diff Review of attachment 8458778 [details] [diff] [review]: ----------------------------------------------------------------- Could you separate this patch into two bugs? One for setting up the track information in MediaInfo(the part you did with Ogg files), and the other for supporting multi-tracks.
Attachment #8458778 - Flags: feedback?(slin) → feedback-
blocking-b2g: 2.1? → ---
blocking-b2g: --- → 2.1?
Summary: Set up track information for AudioTrack and VideoTrack per different codec decoder → Set up audio track and video track information for Ogg files
blocking-b2g: 2.1? → ---
Attached patch mt_setup_ogg_only.diff (obsolete) — Splinter Review
Hello Shelly, As your Comment 6, I extracted the part of setting up track information for Ogg format into new patch "mt_setup_ogg_only.diff" And I'll create another bug for supporting multi-tracks operation for all codecs. Thanks for your suggestion.
Attachment #8459459 - Flags: feedback?(slin)
Per Comment 7, new created Bug 1041438 to support multi-tracks operation for all codecs.
Comment on attachment 8459459 [details] [diff] [review] mt_setup_ogg_only.diff Review of attachment 8459459 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/MediaInfo.h @@ +12,5 @@ > #include "nsString.h" > > namespace mozilla { > > +enum MediaTrackCategory{ Space before {. ::: content/media/ogg/OggCodecState.cpp @@ +1138,3 @@ > static bool IsSkeletonBOS(ogg_packet* aPacket) > { > return aPacket->bytes >= SKELETON_MIN_HEADER_LEN && Please take off the whitespace even it wasn't left by you, thanks :) @@ +1382,5 @@ > } > > +bool SkeletonState::DecodeFisbone(ogg_packet* aPacket) > +{ > + uint32_t offsetToMsgHeaderField = LittleEndian::readUint32(aPacket->packet + FISBONE_OFFSET_TO_MESSAGE_HEADER_FIELDS_OFFSET); Although it is not really a must, but please try restricting you line in 80-char as much as possible. @@ +1426,5 @@ > + pToMessage++; > + }; > + > + std::map<uint32_t, MessageFieldInfo>::iterator iter; > + iter = mMapSerial2MsgFieldInfo.find(serialno); Same here, use mozilla modules unless you really can't find a matching one. ::: content/media/ogg/OggCodecState.h @@ +384,5 @@ > > +// Stores the message information for different logical bitstream. > +typedef struct stMessageFieldInfo { > + // serialno of bitstream > + uint32_t mSerialno; Please use single space. @@ +401,5 @@ > public: > SkeletonState(ogg_page* aBosPage); > ~SkeletonState(); > + > + std::map<uint32_t, MessageFieldInfo> mMapSerial2MsgFieldInfo; Try using nsClassHashtable instead of std modules. @@ +472,5 @@ > > // Decodes an index packet. Returns false on failure. > bool DecodeIndex(ogg_packet* aPacket); > + // Decodes an fisbone packet. Returns false on failure. > + bool DecodeFisbone(ogg_packet* aPacket); Don't indent the declaration. ::: content/media/ogg/OggReader.cpp @@ +86,5 @@ > // is about 4300 bytes, so we read the file in chunks larger than that. > static const int PAGE_STEP = 8192; > > +// Ogg Skeleton info to media-element trackList-getKind-categories > +static MediaTrackCategory GetCategoryEnumFromOggRoleInfo(const nsCString& aRole) Why do you need to convert aRole into an enum of category first, and then convert that enum of category to the kind of TrackInfo? Can you use a "category parser" and story the result into aKind directly? ::: content/media/ogg/OggReader.h @@ +265,5 @@ > + void SetupTargetTheora(TheoraState* aTheoraState); > + void SetupTargetVorbis(VorbisState* aVorbisState); > + void SetupTargetOpus(OpusState* aOpusState); > + void SetupTargetSkeleton(SkeletonState* aSkeletonState); > + void SetupMediaTracksInfo(const nsTArray<uint32_t>& aSerailArray); s/aSerailArray/aSerails
> ::: content/media/ogg/OggReader.cpp > @@ +86,5 @@ > > // is about 4300 bytes, so we read the file in chunks larger than that. > > static const int PAGE_STEP = 8192; > > > > +// Ogg Skeleton info to media-element trackList-getKind-categories > > +static MediaTrackCategory GetCategoryEnumFromOggRoleInfo(const nsCString& aRole) > > Why do you need to convert aRole into an enum of category first, and then > convert that enum of category to the kind of TrackInfo? Can you use a > "category parser" and story the result into aKind directly? Sure, that's ok :) FYI, I separated the function into 2 parts is because the parsed text information for Category(Kind) is different among formats. E.g., Ogg: "audio/main", WebM: "FlagDefault", DASH: "main", These all stand for Category "main". That's why I implement a function GetKindFromCategoryEnum() which is originally placed in MediaInfo.h for all codecs.
Hi Shelly, Regarding "category parser", I plan to do like the following pseudo code in OggReader.cpp. ============================================= 1. MediaTrackCategory OggCateReader(const nsCString& aRole) { // Parse category from OggRole Info. } 2. typedef MediaTrackCategory CategoryParser(const nsCString& aContent); 3. static void GetKindFromCategoryParser(CategoryParser aParser, const nsCString& aContent, nsAString& aKind) { MediaTrackCategory cat = aParser(aContent); switch (cat): // return final category string to aKind. } ============================================= In this case, if we're going to support more formats, we only need to implement each codec's own OOOCategoryParser, and move 2&3 to MediaInfo.h w/o a line change. And we still cant call GetKindFromCategoryParser() to store aKind directly. What do you think ?
Flags: needinfo?(slin)
(In reply to Kilik Kuo [:kilikkuo] from comment #11) > Hi Shelly, > > Regarding "category parser", > I plan to do like the following pseudo code in OggReader.cpp. > ============================================= > 1. > MediaTrackCategory OggCateReader(const nsCString& aRole) > { // Parse category from OggRole Info. } > 2. > typedef MediaTrackCategory CategoryParser(const nsCString& aContent); > 3. > static void GetKindFromCategoryParser(CategoryParser aParser, const > nsCString& aContent, nsAString& aKind) GetKind() is good enough. > { > MediaTrackCategory cat = aParser(aContent); > switch (cat): > // return final category string to aKind. > } > ============================================= > In this case, if we're going to support more formats, > we only need to implement each codec's own OOOCategoryParser, and move 2&3 > to MediaInfo.h w/o a line change. And we still cant call > GetKindFromCategoryParser() to store aKind directly. > > What do you think ? As we talked over skype.
Flags: needinfo?(slin)
Attached patch patch_comment_9.diff (obsolete) — Splinter Review
Attachment #8460100 - Flags: feedback?(slin)
Attachment #8459459 - Attachment is obsolete: true
Attachment #8459459 - Flags: feedback?(slin)
Attachment #8458778 - Attachment is obsolete: true
Comment on attachment 8460100 [details] [diff] [review] patch_comment_9.diff Hello Shelly, I've refined the patch according to Comment 9, Please have a look, and see if there's any inappropriate part. Thanks for your time and feedback.
Comment on attachment 8460100 [details] [diff] [review] patch_comment_9.diff Review of attachment 8460100 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/ogg/OggCodecState.cpp @@ +9,5 @@ > #include "mozilla/DebugOnly.h" > #include "mozilla/Endian.h" > #include <stdint.h> > > +#include "nsCRTGlue.h" What is this include for? @@ +1131,5 @@ > static const size_t INDEX_LAST_NUMER_OFFSET = 34; > static const size_t INDEX_KEYPOINT_OFFSET = 42; > > +// Byte-offsets of the fields in the Skeleton Fisbone packet > +static const size_t FISBONE_OFFSET_TO_MSG_FIELDS = 8; s/FISBONE_OFFSET_TO_MSG_FIELDS/FISBONE_MSG_FIELDS_OFFSET @@ +1387,5 @@ > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > + uint32_t msgFieldLen = aPacket->bytes - (FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField); > + > + char* pToMessage = (char*)aPacket->packet + FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField; > + char* pStart = pToMessage; What are pToMessage and pStart mean? @@ +1388,5 @@ > + uint32_t msgFieldLen = aPacket->bytes - (FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField); > + > + char* pToMessage = (char*)aPacket->packet + FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField; > + char* pStart = pToMessage; > + nsAutoPtr<MessageFieldInfo> stMsgInfo(new MessageFieldInfo()); s/stMsgInfo/field @@ +1391,5 @@ > + char* pStart = pToMessage; > + nsAutoPtr<MessageFieldInfo> stMsgInfo(new MessageFieldInfo()); > + > + while (msgFieldLen) > + { { at the end of while statement. @@ +1393,5 @@ > + > + while (msgFieldLen) > + { > + if (*pToMessage == '\r' && *(pToMessage+1) == '\n') > + { { at the end of if statement. @@ +1397,5 @@ > + { > + // Message Header Fields are supposed to follow [RFC2822], which should be US-ASCII encoded. > + nsAutoCString strMsg(pStart, pToMessage-pStart); > + if ((strMsg.Find("Content-Type:")) != -1) { > + stMsgInfo->mContentType.Assign(pStart+13, pToMessage-pStart-13); If you drop the init of strings in struct declaration, then you can do stMsgInfo->mContentType = nsCString(buf, len); ::: content/media/ogg/OggCodecState.h @@ +382,5 @@ > // version numbers. > #define SKELETON_VERSION(major, minor) (((major)<<16)|(minor)) > > +// Stores the message information for different logical bitstream. > +typedef struct stMessageFieldInfo { s/stMessageFieldInfo/_MessageField @@ +393,5 @@ > + mTitle(""), > + mDisplayHint(""), > + mTrackOrder(""), > + mAltitude(""), > + mTrackDependencies("") I don't think you need to have all your strings constructed here, you can do lazy-constructing at assigning, or what's the reason of constructed them at declaration? @@ +405,5 @@ > + nsCString mDisplayHint; > + nsCString mTrackOrder; > + nsCString mAltitude; > + nsCString mTrackDependencies; // (Not clarified yet) > +} MessageFieldInfo; s/MessageFieldInfo/MessageField ::: content/media/ogg/OggReader.cpp @@ +85,5 @@ > // Chunk size to read when reading Ogg files. Average Ogg page length > // is about 4300 bytes, so we read the file in chunks larger than that. > static const int PAGE_STEP = 8192; > > +// According following specification, return corresponding category nit: whitespace. // Return the corresponding category in aKind base on the following spec. // (your link here) @@ +87,5 @@ > static const int PAGE_STEP = 8192; > > +// According following specification, return corresponding category > +// http://dev.w3.org/html5/spec-preview/media-elements.html#dom-audiotrack-kind > +static void GetKind(const nsCString& aRole, nsAString& aKind) { { in new line. @@ +257,5 @@ > +} > + > +void OggReader::SetupTargetSkeleton(SkeletonState* aSkeletonState) > +{ > + // Setup skeleton related information after mVorbisState & mTheroState being set (if they exist). 80-char line please. @@ +291,5 @@ > + } > + > + if (codecState->GetType() == OggCodecState::TYPE_THEORA) { > + // if main, setup message information if it has > + // if not, setup original information and add to array Sorry....I don't get your comment here. @@ +299,5 @@ > + if (msgInfo) { > + GetKind(msgInfo->mRole, videoInfo.mTrackInfo.mKind); > + videoInfo.mTrackInfo.mLanguage.AssignWithConversion(msgInfo->mLanguage); > + videoInfo.mTrackInfo.mId.AssignWithConversion(msgInfo->mName); > + videoInfo.mTrackInfo.mEnabled = (mTheoraState == theoraState)? true:false; Can you use the Init() function of mTrackInfo? @@ +315,5 @@ > + videoInfo.mHasVideo = IsValidVideoRegion(frameSize, picture, displaySize)? true:false; > + > + // Currently only set up primary track information. > + // TODO: Once MediaInfo support VideoInfo array, we can insert primary > + // videoInfo at 0 index and appendElement for other videoInfo in else part. Currently set up the information of primary track only. TODO: Once MediaInfo supports multiple TrackInfo, we can then append the rest videoInfo to the array of track infos. @@ +320,5 @@ > + if (mTheoraState && mTheoraState->mSerial == theoraState->mSerial) { > + mInfo.mVideo = videoInfo; > + } else { > + continue; > + } Can you move the above if statement below where you get theoraState? Seems like you are doing all the hard works no matter what, but by-passing the other non-primary ones at the end. @@ +344,5 @@ > + if (mVorbisState && mVorbisState->mSerial == vorbisState->mSerial) { > + mInfo.mAudio = audioInfo; > + } else { > + continue; > + } Same as above. @@ +369,5 @@ > + if (mOpusState && mOpusState->mSerial == opusState->mSerial) { > + mInfo.mAudio = audioInfo; > + } else { > + continue; > + } Same here. @@ +426,5 @@ > > // We've read all BOS pages, so we know the streams contained in the media. > + // 1. Process all available header packets in the Theora, Vorbis/Opus bitstreams. > + // 2. Find the first encountered Theora/Vorbis/Opus bitstream, and configure > + // it as target A/V bitstream. it as the target A/V bitstream. @@ +428,5 @@ > + // 1. Process all available header packets in the Theora, Vorbis/Opus bitstreams. > + // 2. Find the first encountered Theora/Vorbis/Opus bitstream, and configure > + // it as target A/V bitstream. > + // 3. Deactivate bitstream of each type(Theora/Vorbis/Opus) that is not > + // encountered in the first place. 3. Deactivate the rest of bitstreams for now, until we have MediaInfo support multiple track infos. p.s. This is my guessing, please correct me if I'm wrong. @@ +467,5 @@ > } > } > > + // Setup skeleton related information after mVorbisState or mTheroState or > + // mOpusState being set (if they exist). What if mVorbisState and others are not initialized yet? @@ +469,5 @@ > > + // Setup skeleton related information after mVorbisState or mTheroState or > + // mOpusState being set (if they exist). > + // Because skeleton bitstream is always encountered and parsed first, so > + // don't put it in the for loop above What for loop are you referring to? That "for loop" will be removed by this patch isn't it? @@ +480,5 @@ > + // 3. Identify the type of logical stream and > + // a) If it has message field, then setup additional Track information. > + // b) Setup original Audio/Vidoe information > + // 4. Check if this bitstream is the first encountered one(main), insert its > + // information to the first place, otherwise, just append to the array. For each serial number 1. Retrieve a codecState from mCodecStore by this serial number. 2. Retrieve a message field from mMsgInfoStore by this serial number. 3. For now, skip if the serial number refers to a non-primary bitstream. 4. Setup track and other audio/video related information per different types. Please move the above details inside your function. And can you ensure the first encountered message field represents the primary bitstream (a.k.a. kind == "main")? @@ +804,5 @@ > } > > + // According to spec, skeleton information should at the begining of a file > + // So, the chained V/A message field information should be stored already if > + // there's any. We should be able to find its message field below. Is there a logic change? I think you can remove the comment if no logic change below. @@ +814,2 @@ > nsAutoPtr<MediaInfo> info(new MediaInfo()); > + // In this case, Video should be the same, only update AudioInfo for now. Is Video the VideoInfo? Or you can remove the comment if no change in logic. @@ +816,5 @@ > if ((newVorbisState && ReadHeaders(newVorbisState)) && > (mVorbisState->mInfo.rate == newVorbisState->mInfo.rate) && > (mVorbisState->mInfo.channels == newVorbisState->mInfo.channels)) { > + > + SetupTargetVorbis(newVorbisState); |SetupTargetVorbis()| copies mVorbisInfo, and null out the mVorbisInfo.codec_setup, is it okay to do the same things here? @@ +821,3 @@ > LOG(PR_LOG_DEBUG, ("New vorbis ogg link, serial=%d\n", mVorbisSerial)); > + > + if (msgInfo) { Move the above declaration of msgInfo here(before the if statement), your declaration of parameters should stay with your implementation as close as possible. @@ +861,5 @@ > { > + mInfo.mAudio.mHasAudio = HasAudio(); > + mInfo.mVideo.mHasVideo = HasVideo(); > + *info = mInfo; > + int rate = mInfo.mAudio.mRate; Can you update the mInfo at once here?
Attachment #8460100 - Flags: feedback?(slin) → feedback-
Hi Shelly, Thanks for your great feedback, really appreciated. (In reply to Shelly Lin [:shelly] from comment #15) > Comment on attachment 8460100 [details] [diff] [review] > patch_comment_9.diff > > Review of attachment 8460100 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: content/media/ogg/OggCodecState.cpp > @@ +9,5 @@ > > #include "mozilla/DebugOnly.h" > > #include "mozilla/Endian.h" > > #include <stdint.h> > > > > +#include "nsCRTGlue.h" > > What is this include for? My bad, I was implemented another parsing algorithm by using NS_strtok, NS_strcmp, and discarded it. But forgot to remove the include header, I'll remove it. > @@ +1387,5 @@ > > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > > + uint32_t msgFieldLen = aPacket->bytes - (FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField); > > + > > + char* pToMessage = (char*)aPacket->packet + FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField; > > + char* pStart = pToMessage; > > What are pToMessage and pStart mean? pToMessage is an pointer going through the message field buffer char by char. pStart is an pointer indicating the starting char of each header field(field name: field body\r\n). They could be named in a better way, pToMessage ==> msgIndicator pStart ==> fieldIndicator What do you think ? > @@ +1397,5 @@ > > + { > > + // Message Header Fields are supposed to follow [RFC2822], which should be US-ASCII encoded. > > + nsAutoCString strMsg(pStart, pToMessage-pStart); > > + if ((strMsg.Find("Content-Type:")) != -1) { > > + stMsgInfo->mContentType.Assign(pStart+13, pToMessage-pStart-13); > > If you drop the init of strings in struct declaration, then you can do > stMsgInfo->mContentType = nsCString(buf, len); Good point ! Thank you, I'll try it :) > @@ +291,5 @@ > > + } > > + > > + if (codecState->GetType() == OggCodecState::TYPE_THEORA) { > > + // if main, setup message information if it has > > + // if not, setup original information and add to array > > Sorry....I don't get your comment here. > Oh, my bad @@. Should explain more detail as below. //If the codecState is the primary bitstream, setting up message information for it additionaly. //If the codecState is not the primary bitstream, setting up necessary information and add it to //array (once we support multi-tracks) > @@ +299,5 @@ > > + if (msgInfo) { > > + GetKind(msgInfo->mRole, videoInfo.mTrackInfo.mKind); > > + videoInfo.mTrackInfo.mLanguage.AssignWithConversion(msgInfo->mLanguage); > > + videoInfo.mTrackInfo.mId.AssignWithConversion(msgInfo->mName); > > + videoInfo.mTrackInfo.mEnabled = (mTheoraState == theoraState)? true:false; > > Can you use the Init() function of mTrackInfo? Yes, the idea is good ! Thanks :) > > @@ +315,5 @@ > > + videoInfo.mHasVideo = IsValidVideoRegion(frameSize, picture, displaySize)? true:false; > > + > > + // Currently only set up primary track information. > > + // TODO: Once MediaInfo support VideoInfo array, we can insert primary > > + // videoInfo at 0 index and appendElement for other videoInfo in else part. > > Currently set up the information of primary track only. > TODO: Once MediaInfo supports multiple TrackInfo, we can then append the > rest videoInfo to the array of track infos. Please excuse my poor English. /_\ > > @@ +320,5 @@ > > + if (mTheoraState && mTheoraState->mSerial == theoraState->mSerial) { > > + mInfo.mVideo = videoInfo; > > + } else { > > + continue; > > + } > > Can you move the above if statement below where you get theoraState? Seems > like you are doing all the hard works no matter what, but by-passing the > other non-primary ones at the end. > hmm, yes currently we by-pass other non-primary bitstreams videoInfo. though it's kinda wasting few temporary memory and computation, but IMO, if supporting multi-tacks information is our goal, then the if statement should be kept in the same place for a more general flow. Isn't it ? > @@ +344,5 @@ > > + if (mVorbisState && mVorbisState->mSerial == vorbisState->mSerial) { > > + mInfo.mAudio = audioInfo; > > + } else { > > + continue; > > + } > > Same as above. > > @@ +369,5 @@ > > + if (mOpusState && mOpusState->mSerial == opusState->mSerial) { > > + mInfo.mAudio = audioInfo; > > + } else { > > + continue; > > + } > > Same here. > > @@ +426,5 @@ > > > > // We've read all BOS pages, so we know the streams contained in the media. > > + // 1. Process all available header packets in the Theora, Vorbis/Opus bitstreams. > > + // 2. Find the first encountered Theora/Vorbis/Opus bitstream, and configure > > + // it as target A/V bitstream. > > it as the target A/V bitstream. > > @@ +428,5 @@ > > + // 1. Process all available header packets in the Theora, Vorbis/Opus bitstreams. > > + // 2. Find the first encountered Theora/Vorbis/Opus bitstream, and configure > > + // it as target A/V bitstream. > > + // 3. Deactivate bitstream of each type(Theora/Vorbis/Opus) that is not > > + // encountered in the first place. > > 3. Deactivate the rest of bitstreams for now, until we have MediaInfo > support multiple track infos. > > p.s. This is my guessing, please correct me if I'm wrong. Bingo :) I'll correct it. > > @@ +467,5 @@ > > } > > } > > > > + // Setup skeleton related information after mVorbisState or mTheroState or > > + // mOpusState being set (if they exist). > > What if mVorbisState and others are not initialized yet? > The existence of mVorbistState/mTheroState/mOpusStata will be checked inside SetupTargetSkeleton() via HasAudio()/HasVideo(), then we'll decide to whether deactivate mSkeletonState directly or keep on parsing header page of Skeleton. > @@ +469,5 @@ > > > > + // Setup skeleton related information after mVorbisState or mTheroState or > > + // mOpusState being set (if they exist). > > + // Because skeleton bitstream is always encountered and parsed first, so > > + // don't put it in the for loop above > > What for loop are you referring to? That "for loop" will be removed by this > patch isn't it? Err...that "for loop" refers to line 433 to line 465, it's just above the comment line 469, and I don't know how to describe it @@. Is there any good way in this case ? In addition, this "for loop" won't be removed by this patch. > > @@ +480,5 @@ > > + // 3. Identify the type of logical stream and > > + // a) If it has message field, then setup additional Track information. > > + // b) Setup original Audio/Vidoe information > > + // 4. Check if this bitstream is the first encountered one(main), insert its > > + // information to the first place, otherwise, just append to the array. > > For each serial number > 1. Retrieve a codecState from mCodecStore by this serial number. > 2. Retrieve a message field from mMsgInfoStore by this serial number. > 3. For now, skip if the serial number refers to a non-primary bitstream. > 4. Setup track and other audio/video related information per different types. > > Please move the above details inside your function. > And can you ensure the first encountered message field represents the > primary bitstream (a.k.a. kind == "main")? > Thank you for these detail description, it's great ! The first encountered message field might not belong to primary bitstream. There's a condition "if (mTheoraState && mTheoraState->mSerial == theoraState->mSerial)" to make sure the encountered message field representing the primary bitstream. > @@ +804,5 @@ > > } > > > > + // According to spec, skeleton information should at the begining of a file > > + // So, the chained V/A message field information should be stored already if > > + // there's any. We should be able to find its message field below. > > Is there a logic change? I think you can remove the comment if no logic > change below. > Hmm, no logic change, just wanna explain it in more detail, I'll remove it. > > @@ +816,5 @@ > > if ((newVorbisState && ReadHeaders(newVorbisState)) && > > (mVorbisState->mInfo.rate == newVorbisState->mInfo.rate) && > > (mVorbisState->mInfo.channels == newVorbisState->mInfo.channels)) { > > + > > + SetupTargetVorbis(newVorbisState); > > |SetupTargetVorbis()| copies mVorbisInfo, and null out the > mVorbisInfo.codec_setup, is it okay to do the same things here? > It should be ok. "memcpy(&mVorbisInfo, &aVorbisState->mInfo, sizeof(mVorbisInfo));" Each VorbisState will clean up its mInfo by calling "vorbis_info_clear(&mInfo)" in its destructor. > @@ +861,5 @@ > > { > > + mInfo.mAudio.mHasAudio = HasAudio(); > > + mInfo.mVideo.mHasVideo = HasVideo(); > > + *info = mInfo; > > + int rate = mInfo.mAudio.mRate; > > Can you update the mInfo at once here? I think only "mInfo.mAudio.mHasAudio = HasAudio()" could be removed. Any suggestion ?
Flags: needinfo?(slin)
(In reply to Kilik Kuo [:kilikkuo] from comment #16) > Hi Shelly, > Thanks for your great feedback, really appreciated. > > (In reply to Shelly Lin [:shelly] from comment #15) > > Comment on attachment 8460100 [details] [diff] [review] > > patch_comment_9.diff > > > > Review of attachment 8460100 [details] [diff] [review]: > > ----------------------------------------------------------------- > > @@ +1387,5 @@ > > > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > > > + uint32_t msgFieldLen = aPacket->bytes - (FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField); > > > + > > > + char* pToMessage = (char*)aPacket->packet + FISBONE_OFFSET_TO_MSG_FIELDS + offsetMsgField; > > > + char* pStart = pToMessage; > > > > What are pToMessage and pStart mean? > > pToMessage is an pointer going through the message field buffer char by char. > pStart is an pointer indicating the starting char of each header field(field > name: field body\r\n). > > They could be named in a better way, > pToMessage ==> msgIndicator > pStart ==> fieldIndicator > > What do you think ? > I see, can you use "probe" and "head"? > > @@ +291,5 @@ > > > + } > > > + > > > + if (codecState->GetType() == OggCodecState::TYPE_THEORA) { > > > + // if main, setup message information if it has > > > + // if not, setup original information and add to array > > > > Sorry....I don't get your comment here. > > > > Oh, my bad @@. Should explain more detail as below. > //If the codecState is the primary bitstream, setting up message information > for it additionaly. > //If the codecState is not the primary bitstream, setting up necessary > information and add it to > //array (once we support multi-tracks) > Oh I see, I think you can remove this comment since you're repeating it in the "steps". > > @@ +320,5 @@ > > > + if (mTheoraState && mTheoraState->mSerial == theoraState->mSerial) { > > > + mInfo.mVideo = videoInfo; > > > + } else { > > > + continue; > > > + } > > > > Can you move the above if statement below where you get theoraState? Seems > > like you are doing all the hard works no matter what, but by-passing the > > other non-primary ones at the end. > > > > hmm, yes currently we by-pass other non-primary bitstreams videoInfo. > though it's kinda wasting few temporary memory and computation, but IMO, > if supporting multi-tacks information is our goal, then the if > statement should be kept in the same place for a more general flow. > Isn't it ? > If mInfo.mVideo is an array and we want to support multi-tracks, will it change the flow if you move |if (primary) {} else {contune;}| prior? > > @@ +467,5 @@ > > > } > > > } > > > > > > + // Setup skeleton related information after mVorbisState or mTheroState or > > > + // mOpusState being set (if they exist). > > > > What if mVorbisState and others are not initialized yet? > > > > The existence of mVorbistState/mTheroState/mOpusStata will be checked inside > SetupTargetSkeleton() > via HasAudio()/HasVideo(), then we'll decide to whether deactivate > mSkeletonState directly or > keep on parsing header page of Skeleton. > > > @@ +469,5 @@ > > > > > > + // Setup skeleton related information after mVorbisState or mTheroState or > > > + // mOpusState being set (if they exist). > > > + // Because skeleton bitstream is always encountered and parsed first, so > > > + // don't put it in the for loop above > > > > What for loop are you referring to? That "for loop" will be removed by this > > patch isn't it? > > Err...that "for loop" refers to line 433 to line 465, it's just above the > comment line 469, > and I don't know how to describe it @@. Is there any good way in this case ? > In addition, this "for loop" won't be removed by this patch. > Oh okay, I first thought the "for loop" is the loop where it was deactivating any non-primary bitstreams. I think you can remove your comment here since you've described the change above. > > @@ +861,5 @@ > > > { > > > + mInfo.mAudio.mHasAudio = HasAudio(); > > > + mInfo.mVideo.mHasVideo = HasVideo(); > > > + *info = mInfo; > > > + int rate = mInfo.mAudio.mRate; > > > > Can you update the mInfo at once here? > > I think only "mInfo.mAudio.mHasAudio = HasAudio()" could be removed. > Any suggestion ? I'm saying maybe you can setup the new information in the local variable "info", then copy "info" into mInfo. Actually, I'm not 100% sure on this, but there's no need to copy the new value into mInfo since we are not dealing with chained videos, right?
Flags: needinfo?(slin)
Comment on attachment 8460100 [details] [diff] [review] patch_comment_9.diff Review of attachment 8460100 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/ogg/OggReader.cpp @@ +257,5 @@ > +} > + > +void OggReader::SetupTargetSkeleton(SkeletonState* aSkeletonState) > +{ > + // Setup skeleton related information after mVorbisState & mTheroState being set (if they exist). Sorry, my bad, I was using 120% view of screen, gosh the 100% view is so small.
(In reply to Shelly Lin [:shelly] from comment #17) > > > @@ +861,5 @@ > > > > { > > > > + mInfo.mAudio.mHasAudio = HasAudio(); > > > > + mInfo.mVideo.mHasVideo = HasVideo(); > > > > + *info = mInfo; > > > > + int rate = mInfo.mAudio.mRate; > > > > > > Can you update the mInfo at once here? > > > > I think only "mInfo.mAudio.mHasAudio = HasAudio()" could be removed. > > Any suggestion ? > I'm saying maybe you can setup the new information in the local variable > "info", then copy "info" into mInfo. Actually, I'm not 100% sure on this, > but there's no need to copy the new value into mInfo since we are not > dealing with chained videos, right? I'm not quite sure about it, The logic of original code is to notify a new MediaInfo which only contains new audio information to MediaDecoderStateMachine. In fact I don't think it's correct because video information will get lost. And once we support multi-tracks information, it would be better to update information in mInfo first, and copy to the new info then pass to MediaDecoderStateMachine for information consistency. That's why I do a copy of mInfo to the new created one here.
Flags: needinfo?(slin)
(In reply to Kilik Kuo [:kilikkuo] from comment #19) > (In reply to Shelly Lin [:shelly] from comment #17) > > > > > @@ +861,5 @@ > > > > > { > > > > > + mInfo.mAudio.mHasAudio = HasAudio(); > > > > > + mInfo.mVideo.mHasVideo = HasVideo(); > > > > > + *info = mInfo; > > > > > + int rate = mInfo.mAudio.mRate; > > > > > > > > Can you update the mInfo at once here? > > > > > > I think only "mInfo.mAudio.mHasAudio = HasAudio()" could be removed. > > > Any suggestion ? > > I'm saying maybe you can setup the new information in the local variable > > "info", then copy "info" into mInfo. Actually, I'm not 100% sure on this, > > but there's no need to copy the new value into mInfo since we are not > > dealing with chained videos, right? > > I'm not quite sure about it, > The logic of original code is to notify a new MediaInfo which only contains > new audio information to MediaDecoderStateMachine. > In fact I don't think it's correct because video information will get lost. > And once we support multi-tracks information, it would be better to update > information in mInfo first, > and copy to the new info then pass to MediaDecoderStateMachine for > information consistency. > That's why I do a copy of mInfo to the new created one here. Oh! I see ur point here! You're right. Please move the declaration of "info" above where you make the copy, thanks!
Flags: needinfo?(slin)
Attached patch patch_comment_20.diff (obsolete) — Splinter Review
Hi Shelly, Thanks for previous feedback. I learned much for that :) And might need your help again. I've attached a re-worked patch - "patch_comment_20.diff". Could you please have a look and let me know if there's any concern ?
Attachment #8463198 - Flags: feedback?(slin)
Comment on attachment 8463198 [details] [diff] [review] patch_comment_20.diff Review of attachment 8463198 [details] [diff] [review]: ----------------------------------------------------------------- Great work! thank you very much :) ::: content/media/ogg/OggCodecState.cpp @@ +1391,5 @@ > + nsAutoPtr<MessageField> field(new MessageField()); > + > + while (msgFieldLen) { > + if (*msgProbe == '\r' && *(msgProbe+1) == '\n') { > + // Message Header Fields are supposed to follow [RFC2822], which should be US-ASCII encoded. The content of message header fields follows [RFC2822], and is encoded in US-ASCII. ::: content/media/ogg/OggCodecState.h @@ +399,5 @@ > public: > SkeletonState(ogg_page* aBosPage); > ~SkeletonState(); > + > + nsClassHashtable<nsUint32HashKey, MessageField> mMsgInfoStore; s/mMsgInfoStore/mMsgFieldStore ::: content/media/ogg/OggReader.cpp @@ +105,5 @@ > + aKind.AssignLiteral(""); > + } > +} > + > +static void InitTrackInfo(TrackInfo& aTrackInfo, const MessageField* aMsgInfo, bool aEnable) s/aMsgInfo/aMsgField There's an |Init()| function of TrackInfo, can you use it? @@ +307,5 @@ > + > + if (codecState->GetType() == OggCodecState::TYPE_THEORA) { > + // Currently set up the information of primary track only. > + // TODO: Once MediaInfo supports multiple TrackInfo, we can then append the > + // rest videoInfo to the array of track infos. I think you can remove this comment (and below) since you've mentioned it at the beginning already. @@ +457,5 @@ > } > } > > + // Setup skeleton related information after mVorbisState or mTheroState or > + // mOpusState being set (if they exist). Please remove these comments. @@ +468,5 @@ > + // 3. Identify the type of logical stream and > + // a) If it has message field, then setup additional track information. > + // b) Setup original audio/vidoe information. > + // 4. Check if this bitstream is the first encountered one(main), insert its > + // information to the first place, otherwise, just append to the array. Please remove these comments. @@ +837,5 @@ > + mInfo.mAudio.mHasAudio = HasAudio(); > + mInfo.mVideo.mHasVideo = HasVideo(); > + nsAutoPtr<MediaInfo> info(new MediaInfo()); > + *info = mInfo; > + int rate = mInfo.mAudio.mRate; I think it's okay to remove the |rate| now.
Attachment #8463198 - Flags: feedback?(slin) → feedback+
Attached patch patch_comment_22.diff (obsolete) — Splinter Review
Hi Shelly, New patch was attached in reply to Comment 22. Please check if the re-worked part is ok or not ~ Thank you :)
Attachment #8460100 - Attachment is obsolete: true
Attachment #8463198 - Attachment is obsolete: true
Attachment #8464480 - Flags: feedback?(slin)
Attached patch patch_comment_22.diff (obsolete) — Splinter Review
Sorry, wrong patch was attached ... /_\ Correct it with new one.
Attachment #8464480 - Attachment is obsolete: true
Attachment #8464480 - Flags: feedback?(slin)
Attachment #8464490 - Flags: feedback?(slin)
Attached patch patch_comment_22_r1.diff (obsolete) — Splinter Review
Update patch again, remove unnecessary comment.
Attachment #8464490 - Attachment is obsolete: true
Attachment #8464490 - Flags: feedback?(slin)
Attachment #8464513 - Flags: feedback?(slin)
Attached patch patch_comment_22_r2.diff (obsolete) — Splinter Review
remove nit space.
Attachment #8464513 - Attachment is obsolete: true
Attachment #8464513 - Flags: feedback?(slin)
Attachment #8464514 - Flags: feedback?(slin)
Comment on attachment 8464514 [details] [diff] [review] patch_comment_22_r2.diff Review of attachment 8464514 [details] [diff] [review]: ----------------------------------------------------------------- Thank you :)
Attachment #8464514 - Flags: feedback?(slin) → feedback+
Comment on attachment 8464514 [details] [diff] [review] patch_comment_22_r2.diff Hello Pearce, Need your help to review this patch. Thanks a lot :)
Attachment #8464514 - Flags: review?(cpearce)
Comment on attachment 8464514 [details] [diff] [review] patch_comment_22_r2.diff Review of attachment 8464514 [details] [diff] [review]: ----------------------------------------------------------------- Ralph has kindly volunteered to review this.
Attachment #8464514 - Flags: review?(cpearce) → review?(giles)
Unfortunately I can't finish today, and am away tomorrow. I will finish review the day after tomorrow.
(In reply to Ralph Giles (:rillian) from comment #30) > Unfortunately I can't finish today, and am away tomorrow. I will finish > review the day after tomorrow. Hi Giles, Is there any suggestion to the patch ? Looking forward to getting feedback from you and hope that I didn't offend you :)
Comment on attachment 8464514 [details] [diff] [review] patch_comment_22_r2.diff Review of attachment 8464514 [details] [diff] [review]: ----------------------------------------------------------------- Sorry I took so long to get back to you. I'm certainly not offended! I got distracted by other work last week. Thanks for working on ogg support. I don't like this dom API very much, but this is a reasonable approach to implementing it. Please address the comments below about parser implementation and ask for review again. I'd also like to see tests for this in a separate patch. Please also file a bug for filing this structure from comments in the respective media tracks; that's an appropriate fallback for streams without skeleton headers. ::: content/media/ogg/OggCodecState.cpp @@ +1383,5 @@ > +bool SkeletonState::DecodeFisbone(ogg_packet* aPacket) > +{ > + uint32_t offsetMsgField = LittleEndian::readUint32(aPacket->packet + FISBONE_MSG_FIELDS_OFFSET); > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > + uint32_t msgFieldLen = aPacket->bytes - (FISBONE_MSG_FIELDS_OFFSET + offsetMsgField); You need to check for over/underflow here. Either compare before add/subtract, or use mfbt::CheckedInt. This is very important when dealing with data from the network. @@ +1385,5 @@ > + uint32_t offsetMsgField = LittleEndian::readUint32(aPacket->packet + FISBONE_MSG_FIELDS_OFFSET); > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > + uint32_t msgFieldLen = aPacket->bytes - (FISBONE_MSG_FIELDS_OFFSET + offsetMsgField); > + > + char* msgProbe = (char*)aPacket->packet + FISBONE_MSG_FIELDS_OFFSET + offsetMsgField; Likewise check here that you're not off the end of the packet data, so you don't have to rely on the code below checking msgFieldLen. @@ +1389,5 @@ > + char* msgProbe = (char*)aPacket->packet + FISBONE_MSG_FIELDS_OFFSET + offsetMsgField; > + char* msgHead = msgProbe; > + nsAutoPtr<MessageField> field(new MessageField()); > + > + while (msgFieldLen) { while (msgFieldLen - 1) {} to account for the lookahead to '\n', right? @@ +1391,5 @@ > + nsAutoPtr<MessageField> field(new MessageField()); > + > + while (msgFieldLen) { > + if (*msgProbe == '\r' && *(msgProbe+1) == '\n') { > + // The content of message header fields follows [RFC2822], and is encoded in US-ASCII. If it's ASCII, please verify this (e.g. IsASCII()) and reject packets that aren't. I suspect it's actually utf-8 though, like vorbis comment tags. Which you'd also need to verify. :) @@ +1395,5 @@ > + // The content of message header fields follows [RFC2822], and is encoded in US-ASCII. > + nsAutoCString strMsg(msgHead, msgProbe-msgHead); > + if ((strMsg.Find("Content-Type:")) != -1) { > + field->mContentType = nsCString(msgHead+13, msgProbe-msgHead-13); > + LOG(PR_LOG_DEBUG, ("Content-Type: %s", field->mContentType.get())); Please put the headers you want to recognize in an array of C string literals. Then you can iterate over them and avoid repeating everything for each one. Use ArrayLength() or strlen() instead of hardcoding the message header length. @@ +1416,5 @@ > + // Not sure what format of string will be returned (Not clear in specification) > + } else if (strMsg.Find("Track dependencies:") != -1) { > + // Not sure what format of string will be returned (Not clear in specification) > + } > + msgHead = msgProbe+2; It's worth adjusting msgFieldLen and msgProbe here too, if only so I don't have to worry about what happens with msgHead > msgProbe. ::: content/media/ogg/OggReader.cpp @@ +86,5 @@ > // is about 4300 bytes, so we read the file in chunks larger than that. > static const int PAGE_STEP = 8192; > > +// Return the corresponding category in aKind based on the following spec. > +// (http://dev.w3.org/html5/spec-preview/media-elements.html#dom-audiotrack-kind) Please cite the whatwg.org version instead. We prefer that spec where possible. E.g. https://www.whatwg.org/specs/web-apps/current-work/multipage/embedded-content.html#dom-audiotrack-kind @@ +95,5 @@ > + } else if (aRole.Find("audio/alternate") != -1 || > + aRole.Find("video/alternate") != -1) { > + return NS_LITERAL_STRING("alternative"); > + } else if (aRole.Find("audio/audiodesc") != -1) { > + return NS_LITERAL_STRING("description"); whatwg spec has 'descriptions' here. @@ +199,5 @@ > } > } > > +void OggReader::SetupTargetTheora(TheoraState* aTheoraState) > +{ Do you need to mTheoraState->Reset() here like you do for the audio codecs?
Attachment #8464514 - Flags: review?(giles) → review-
Hello Giles, Thank you. These suggestions are helpful to me and I've done several corrections according to Comment 32. One question, Regarding the modification for iterating headers in an array to find the target header name. Because respective header value needs be filled into corresponding member variable in struct |MessageField|, another function |GetMsgFieldMemberPtr| is added to make the code more neat inside while loop. I'm not quite sure if this is what you meant. Maybe I misunderstand something. Since there will be correlation between the array and the newly added function, and I don't think it's good. Would you mind explaining more ? Thanks ~ And also, a bug is created. Bug 1053717 - OggCodec should be responsible for initializing media track information for Ogg files.
Attachment #8472911 - Flags: review?(giles)
Currently, I choose 4 cases. Case 1 - File with skeleton header (version 4) Case 2 - File with skeleton header (version 3) Case 3 - File without skeleton header (only Theora bitstream) Case 4 - File without skeleton header (only Vorbis bitstream) I only found ogv files which contains field "Content Type", "Name" & "Role" in their headers. Hence, I hard coded the parsing result to test correctness. Not sure if this is a good way to test :(
Attachment #8464514 - Attachment is obsolete: true
Attachment #8472915 - Flags: review?(giles)
Comment on attachment 8472911 [details] [diff] [review] Part1 - Parsing ogg fisbone and fill into audiotrack/videotrack.diff Review of attachment 8472911 [details] [diff] [review]: ----------------------------------------------------------------- Better, except for my bad table advice. :) Please address comments and submit for review again. ::: content/media/ogg/OggCodecState.cpp @@ +1137,2 @@ > static bool IsSkeletonBOS(ogg_packet* aPacket) > { While we're here, please add static_assert(SKELETON_MIN_HEADER_LEN >= 8); here to verify we won't make an uninitialized read. @@ +1146,5 @@ > memcmp(reinterpret_cast<char*>(aPacket->packet), "index", 5) == 0; > } > > +static bool IsSkeletonFisbone(ogg_packet* aPacket) > +{ static_assert(SKELETON_MIN_FISBONE_LEN >= 8); @@ +1404,5 @@ > + return &aMsgField->mTrackDependencies; > + } > + } > + return nullptr; > +} No, that wasn't what I meant. Sorry for the confusion. I guess the problem is you're mapping strings to method names. You can do that with a C preprocessor macro. Or you could make MessageField itself a hash table. Your C string array would still limit the recognized field names and a second column could still map them to key names from the spec. @@ +1413,5 @@ > + CheckedUint32 serialno = CheckedUint32(LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET)); > + CheckedUint32 msgFieldLen = CheckedUint32(aPacket->bytes) - (FISBONE_MSG_FIELDS_OFFSET + offsetMsgField); > + if (!msgFieldLen.isValid() || !offsetMsgField.isValid() || !serialno.isValid()) { > + return false; > + } Yay CheckedInt. Some tricks here: I think you can use the implicit copy contructor on initialization. Try CheckedUint32 offsetMsgField = LittleEndian::readUint32(...); to avoid repeating the class name. You have to create the object to get checked arithmetic. CheckedInt(a + b) does normal arithmetic on a + b, and then makes a checked int from the result whether it overflowed or not. So isValid() will always succeed for the first two. However, I was really only worried about arithmetic overflow in the msgFieldLen calculation. For offsetMsgField and serialno, you still need to check the read address against aPacket->bytes before you actually read from those addresses. Likewise make sure msgProbe comes out between the start and the end of the packet data. @@ +1446,5 @@ > + } > + } > + msgHead = msgProbe+2; > + msgProbe = msgHead; > + msgLength-=2; Put the msgHead and msgLength updates together since they track the same thing. ::: content/media/ogg/OggReader.cpp @@ +101,5 @@ > + } else if (aRole.Find("video/sign") != -1) { > + return NS_LITERAL_STRING("sign"); > + } else if (aRole.Find("audio/dub") != -1) { > + return NS_LITERAL_STRING("translation"); > + } I think it's worth adding a mapping from "audio/commentary" to "commentary" here, since such tracks are fairly popular. http://wiki.xiph.org/SkeletonHeaders#Role seems to the be spec list for the skeleton side. @@ +102,5 @@ > + return NS_LITERAL_STRING("sign"); > + } else if (aRole.Find("audio/dub") != -1) { > + return NS_LITERAL_STRING("translation"); > + } > + return NS_LITERAL_STRING(""); EmptyString() @@ +203,5 @@ > +void OggReader::SetupTargetTheora(TheoraState* aTheoraState) > +{ > + if (mTheoraState) { > + mTheoraState->Reset(); > + } Please remove trailing whitespace. @@ +290,5 @@ > + // 1. Retrieve a codecState from mCodecStore by this serial number. > + // 2. Retrieve a message field from mMsgFieldStore by this serial number. > + // 3. For now, skip if the serial number refers to a non-primary bitstream. > + // 4. Setup track and other audio/video related information per different types. > + for (uint32_t i = 0; i < aSerials.Length(); i++) { nsTArray index and length are now size_t.
Attachment #8472911 - Flags: review?(giles) → review-
Comment on attachment 8472915 [details] [diff] [review] Part 2 - Test case for parsing Ogg skeleton fisbone header Review of attachment 8472915 [details] [diff] [review]: ----------------------------------------------------------------- Please address comments and let me see it again. > I only found ogv files which contains field "Content Type", > "Name" & "Role" in their headers. It would be fine, to add additional message header fields to the two files with skeleton. ::: content/media/test/manifest.js @@ +258,5 @@ > +var gMultitrackInfoOggPlayList = [ > + { name:"sample_fisbone.ogv", type:"video/ogg", duration:5.016 }, > + { name:"multiple-bos.ogg", type:"video/ogg", duration:0.431 }, > + { name:"seek.ogv", type:"video/ogg", duration:3.966, size:285310 }, > + { name:"sound.ogg", type:"audio/ogg" } Why do some of these have durations and sizes, and others not? I think it makes more sense to put the expected result values here, rather than in the test file. Unless things like kind=main, id=audio_1 are always going to be predictable, in which case hard code them in the test. ::: content/media/test/test_mediatrack_parsing_ogg.html @@ +46,5 @@ > + if (r.hasOwnProperty("video_id") && r.hasOwnProperty("video_kind")) { > + is(e.videoTracks.length, 1, "The length of video track should be 1"); > + is(e.videoTracks[0].id, r.video_id, "File: " + msg + ", Video track Id OK !"); > + is(e.videoTracks[0].kind, r.video_kind, "File: " + msg + ", Video track Kind OK !"); > + } This will silently pass if the file isn't in results. I think it would be better to use ok() to test that the properties you're checking exist and are not null. Keep in mind also that the string has to work when the test fails. "Video track kind" is better than 'Video track Kind OK !" @@ +79,5 @@ > + > +SpecialPowers.pushPrefEnv({ > + "set": [["media.track.enabled", true]] > +}); > +manager.runTests(gMultitrackInfoOggPlayList, startTest); You have to pass the function to run with the Env change as a second argument to pushPrefEnv: SpecialPowers.pushPrefEnv({"set": [["media.track.enabled", true]]}, function() { manager.runTests(gMultitrackInfoOggPlayList, startTest); } );
Attachment #8472915 - Flags: review?(giles) → review-
Hello Giles, Thank you again for these suggestions and hints :) I changed the data structure of MessageField via using a hash table inside like you've suggested. It looks much better !(I guess :)) Also more check before each pointer really accesses packet address. (No need to put class CheckedInt to every variable, should be more reasonable) In addition, not only the ROLE "commentary" was added, I also added mappings for other ROLEs which could be found in the categories for AudioTrack/VideoTrack. To make it more completed. Regarding the 2nd patch for test. Sorry, I need more time to finish it. Since ffmpeg2thero curretnly hard code the content of message header, I'm trying to build ffmpeg & ffmpeg2theora source code to add additional message header fields for ogg/ogv files. But there seems to be some dependency/version problem, still figuring it out :( I'll upload 2nd patch soon once I make it ! Thanks for your review.
Attachment #8472911 - Attachment is obsolete: true
Attachment #8475762 - Flags: review?(giles)
Comment on attachment 8475762 [details] [diff] [review] Part1 - Parsing ogg fisbone and fill into audiotrack/videotrack.diff Review of attachment 8475762 [details] [diff] [review]: ----------------------------------------------------------------- Hash table is much better, thanks! Still needs some fixes to the bounds checking in the parser. Then I think we're good. ::: content/media/ogg/OggCodecState.cpp @@ +1137,4 @@ > static bool IsSkeletonBOS(ogg_packet* aPacket) > { > + static_assert(SKELETON_MIN_HEADER_LEN >= 28, > + "Length of Skeleton BOS Header should be greater than 28"); Hmm. I suppose this works as a second check on spec length. I meant to just compare with 8, since that's the number of bytes you're about to read. I guess it's ok, because if the spec value gets smaller the assert will fail, and if it gets longer we're still ok. Mostly I'm just having to think about where 28 came from. :) @@ +1390,5 @@ > +{ > + CheckedUint32 packetSize(aPacket->bytes); > + if (!packetSize.isValid()) { > + return false; > + } So this checks that the long aPacket->bytes fits in a uint32_t inside the constructor? Neat, I didn't know there was a type variant for that. Might be simpler just to compare with MaxValue<uint32_t> and use a bare type after that. Depends if you need to check for overflow with it... @@ +1401,5 @@ > + return false; > + } > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > + > + CheckedUint32 checked_fields_pos(FISBONE_MSG_FIELDS_OFFSET + offsetMsgField); You need to construct the CheckedUint32 first, and then add the value read from the file in a separate step, to activate the checked arithmetic. @@ +1423,5 @@ > + {"TrackOrder:", eTrackOrder}, > + {"Track dependencies:", eTrackDependencies} > + }; > + > + int64_t msgLength = int64_t(msgFieldLen); Why are we back to a 64 bit type after using uint32_t above? @@ +1433,5 @@ > + // The content of message header fields follows [RFC2822], and the > + // mandatory message field must be encoded in US-ASCII, others > + // must be be encoded in UTF-8. "Content-Type" must come first > + // for all of message header fields. > + // See http://svn.annodex.net/standards/draft-pfeiffer-oggskeleton-current.txt. It doesn't look like you verify that "Content-Type" comes first. This is a MUST in the spec, so you should reject streams where it's not true. @@ +1439,5 @@ > + EMsgHeaderType eHeaderType = kFieldTypeMaps[i].mMsgHeaderType; > + if (!field->mValuesStore.Contains(eHeaderType)) { > + uint32_t nameLen = strlen(kFieldTypeMaps[i].mPatternToRecognize); > + field->mValuesStore.Put(eHeaderType, new nsCString(msgHead+nameLen, > + msgProbe-msgHead-nameLen)); Do you need to strip any leading/trailing whitespace from the message header value? The spec says there can be more than zero or more spaces after the colon. Does the spec say whether, with duplicate entries, the first or last occurance takes precedence? If not, figure out what email and http do and copy that. You should also implement the indent line continuation support, but that can be a separate bug. @@ +1447,5 @@ > + } > + } > + msgProbe += 2; > + msgHead = msgProbe; > + msgLength -= 2; Please update msgProbe and msgLength on adjacent lines. ::: content/media/ogg/OggReader.cpp @@ +118,5 @@ > +static void InitTrack(MessageField* aMsgInfo, TrackInfo* aInfo, bool aEnable) > +{ > + if (!aMsgInfo || !aInfo) { > + return; > + } Since you always call this with non-null arguments, you can MOZ_ASSERT(aMsginfo); MOZ_ASSERT(aInfo); here instead. Release builds will crash (but shouldn't unless there's a bug) and debug builds will say why release builds are crashing. @@ +126,5 @@ > + nsCString* sLanguage = aMsgInfo->mValuesStore.Get(eLanguage); > + aInfo->Init(sName? NS_ConvertASCIItoUTF16(*sName):EmptyString(), > + sRole? GetKind(*sRole):EmptyString(), > + sTitle? NS_ConvertASCIItoUTF16(*sTitle):EmptyString(), > + sLanguage? NS_ConvertASCIItoUTF16(*sLanguage):EmptyString(), NS_ConvertUTF8toUTF16()
Attachment #8475762 - Flags: review?(giles) → review-
> @@ +1390,5 @@ > > +{ > > + CheckedUint32 packetSize(aPacket->bytes); > > + if (!packetSize.isValid()) { > > + return false; > > + } > > So this checks that the long aPacket->bytes fits in a uint32_t inside the > constructor? Neat, I didn't know there was a type variant for that. > > Might be simpler just to compare with MaxValue<uint32_t> and use a bare type > after that. Depends if you need to check for overflow with it... The use of CheckedUint32 for "aPacket->bytes" will be removed. Previously, I just use it to verify if the variable is negative or not. But yeah, directly comparing it would be more intuitive, thanks :) > @@ +1423,5 @@ > > + {"TrackOrder:", eTrackOrder}, > > + {"Track dependencies:", eTrackDependencies} > > + }; > > + > > + int64_t msgLength = int64_t(msgFieldLen); > > Why are we back to a 64 bit type after using uint32_t above? Originally, it was used to make the while-loop break if "msgLength" becomes negative. But now I think using int64_t type at the beginning when msgLength is calculated would be a simpler and better way. > @@ +1439,5 @@ > > + EMsgHeaderType eHeaderType = kFieldTypeMaps[i].mMsgHeaderType; > > + if (!field->mValuesStore.Contains(eHeaderType)) { > > + uint32_t nameLen = strlen(kFieldTypeMaps[i].mPatternToRecognize); > > + field->mValuesStore.Put(eHeaderType, new nsCString(msgHead+nameLen, > > + msgProbe-msgHead-nameLen)); > > Do you need to strip any leading/trailing whitespace from the message header > value? The spec says there can be more than zero or more spaces after the > colon. I saw this on spec, and I tend not to strip whitespace from these values. In my p.o.v, parser should only do content parsing, and transfer these "original" content to the one who use them. Parser should not modify content. But, I'm not quit sure about this here...@@ What do you think ? > > Does the spec say whether, with duplicate entries, the first or last > occurance takes precedence? If not, figure out what email and http do and > copy that. I didn't see descriptions regarding duplicated entries on spec, and I find how http handles response header [1]. It keeps the 1st value. Would that be ok to keep the same behavior (current) here ? > > @@ +126,5 @@ > > + nsCString* sLanguage = aMsgInfo->mValuesStore.Get(eLanguage); > > + aInfo->Init(sName? NS_ConvertASCIItoUTF16(*sName):EmptyString(), > > + sRole? GetKind(*sRole):EmptyString(), > > + sTitle? NS_ConvertASCIItoUTF16(*sTitle):EmptyString(), > > + sLanguage? NS_ConvertASCIItoUTF16(*sLanguage):EmptyString(), > > NS_ConvertUTF8toUTF16() Oops! Sorry for my careless mistake. [1] http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHeaderArray.h#27
Hello Ralph, I've addressed the comment from Comment 38 and explained some in comment 39. Again, thank you very much :)
Attachment #8475762 - Attachment is obsolete: true
Attachment #8477333 - Flags: review?(giles)
Yes, I finally successfully built ffmpeg2theora and have done modification to add addition information to Ogg fisbone message header field :) Per Comment 36, > Why do some of these have durations and sizes, and others not? Sorry, I just copy/paste those file information from other pieces of code in content/media/test/manifest.js, and didn't notice their inconsistent information. >> + if (r.hasOwnProperty("video_id") && r.hasOwnProperty("video_kind")) { >> + is(e.videoTracks.length, 1, "The length of video track should be 1"); >> + is(e.videoTracks[0].id, r.video_id, "File: " + msg + ", Video track Id OK !"); >> + is(e.videoTracks[0].kind, r.video_kind, "File: " + msg + ", Video track Kind OK !"); >> + } >This will silently pass if the file isn't in results. I think it would be better to use ok() to test that the properties you're checking exist and are not null. I've added a check to verify if the playing file is one of our tested files. And also check the audio/video availability for both expected results and parsed results. After doing so, I guess there won't be any silent pass if the file isn't in the result :)
Attachment #8472915 - Attachment is obsolete: true
Attachment #8477342 - Flags: review?(giles)
Comment on attachment 8477342 [details] [diff] [review] Part 2 - Test case for parsing Ogg skeleton fisbone header_v2.diff Review of attachment 8477342 [details] [diff] [review]: ----------------------------------------------------------------- r=me with comments addressed. ::: content/media/test/test_mediatrack_parsing_ogg.html @@ +44,5 @@ > + var v = document.createElement('video'); > + v.preload = "metadata"; > + v.token = token; > + manager.started(token); > + var checkMetadata = localCheckMetadata; Do you need this indirection? @@ +63,5 @@ > + v.addEventListener("loadedmetadata", check, false); > + // We should get "ended" events for every resource > + v.addEventListener("ended", checkEnded, false); > + document.body.appendChild(v); > + v.play(); Do you need to play these all the way through and wait for ended, or is just appending the video to the document enough to trigger loadedmetadata? If you don't need to wait for ended the test will be faster.
Attachment #8477342 - Flags: review?(giles) → review+
(In reply to Kilik Kuo [:kilikkuo] from comment #39) > I didn't see descriptions regarding duplicated entries on spec, and I find > how http handles response header > http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHeaderArray.h#27 > It keeps the 1st value. Thanks for digging that up. More precisely, or code has a list of singleton headers, for which the first encounter value takes precendence. All other headers are merged, either with a comma as a joiner, or in the case of some types like cookies, with a newline joiner. > Would that be ok to keep the same behavior (current) here? It's fine for this patch, but please open a bug about improving the implementation, even if you don't intend to work on it yourself, so we know there's more work to be done to match the spec.
Comment on attachment 8477342 [details] [diff] [review] Part 2 - Test case for parsing Ogg skeleton fisbone header_v2.diff Review of attachment 8477342 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/test/test_mediatrack_parsing_ogg.html @@ +59,5 @@ > + removeNodeAndSource(v); > + manager.finished(v.token); > + } > + }}(test, v); > + v.addEventListener("loadedmetadata", check, false); Hey Kilik, please use v.onloadedmetada = function() {}; (This is simpler, and same as the onended event handeler) and make sure to null out event handlers before finishing the test.
Thanks Ralph & Shelly, Yes, there's no need to wait for ended against these testing audio/video. I removed that. Also using |v.onloadedmetadata = function ()| instead of v.addEventListener is much clear. :) carry r+ from Comment 42.
Attachment #8477342 - Attachment is obsolete: true
Attachment #8478858 - Flags: review+
Blocks: 1058444
(In reply to Ralph Giles (:rillian) from comment #43) > (In reply to Kilik Kuo [:kilikkuo] from comment #39) > > > I didn't see descriptions regarding duplicated entries on spec, and I find > > how http handles response header > > http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHeaderArray.h#27 > > It keeps the 1st value. > > Thanks for digging that up. More precisely, or code has a list of singleton > headers, for which the first encounter value takes precendence. All other > headers are merged, either with a comma as a joiner, or in the case of some > types like cookies, with a newline joiner. > Thanks for the explanation, makes it more clear to me :) > > Would that be ok to keep the same behavior (current) here? > > It's fine for this patch, but please open a bug about improving the > implementation, even if you don't intend to work on it yourself, so we know > there's more work to be done to match the spec. I've filed a bug (Bug 1058444) to track implementation for indent line continuation support and handle duplicated field values in a flexible way.
Comment on attachment 8477333 [details] [diff] [review] Part1 - Parsing ogg fisbone and fill into audiotrack/videotrack_v3.diff Review of attachment 8477333 [details] [diff] [review]: ----------------------------------------------------------------- r=me with comments addressed. Thanks for doing all the iterations! ::: content/media/ogg/OggCodecState.cpp @@ +1399,5 @@ > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > + > + CheckedUint32 checked_fields_pos = CheckedUint32(FISBONE_MSG_FIELDS_OFFSET) + offsetMsgField; > + if (!checked_fields_pos.isValid() || > + aPacket->bytes <= checked_fields_pos.value()) { Would the == case happen if there are no message fields? @@ +1420,5 @@ > + {"Track dependencies:", eTrackDependencies} > + }; > + > + bool isContentTypeParsed = false; > + while (msgLength >= 1) { msgLength > 1 or msgLength >= 2. Length must be at least 2 to read msgProbe[1]. @@ +1423,5 @@ > + bool isContentTypeParsed = false; > + while (msgLength >= 1) { > + if (*msgProbe == '\r' && *(msgProbe+1) == '\n') { > + nsAutoCString strMsg(msgHead, msgProbe-msgHead); > + for (uint32_t i = 0; i < ArrayLength(kFieldTypeMaps); i++) { ArrayLength returns a size_t. Probably better to use that for i as well.
Attachment #8477333 - Flags: review?(giles) → review+
Addressed issues and carry r+ from Comment 47 ! Fixed several defects found after pushing try. Please see Comment 48.
Attachment #8477333 - Attachment is obsolete: true
Attachment #8482094 - Flags: review+
Comment on attachment 8482094 [details] [diff] [review] Part1 - Parsing ogg fisbone and fill into audiotrack/videotrack_v4.diff Review of attachment 8482094 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/ogg/OggCodecState.cpp @@ +1392,5 @@ > + return false; > + } > + uint32_t offsetMsgField = LittleEndian::readUint32(aPacket->packet + FISBONE_MSG_FIELDS_OFFSET); > + > + if (aPacket->bytes < static_cast<long>(FISBONE_SERIALNO_OFFSET + 4)) { static_cast<long> is added to make compile error free. @@ +1399,5 @@ > + uint32_t serialno = LittleEndian::readUint32(aPacket->packet + FISBONE_SERIALNO_OFFSET); > + > + CheckedUint32 checked_fields_pos = CheckedUint32(FISBONE_MSG_FIELDS_OFFSET) + offsetMsgField; > + if (!checked_fields_pos.isValid() || > + aPacket->bytes < static_cast<int64_t>(checked_fields_pos.value())) { static_cast<int64_t> is added to make compile error free. ::: content/media/ogg/OggReader.cpp @@ +342,5 @@ > + nsIntSize displaySize = nsIntSize(theoraState->mInfo.pic_width, > + theoraState->mInfo.pic_height); > + nsIntSize frameSize(theoraState->mInfo.frame_width, > + theoraState->mInfo.frame_height); > + ScaleDisplayByAspectRatio(displaySize, theoraState->mPixelAspectRatio); |ScaleDisplayByAspectRatio(displaySize, theoraState->mPixelAspectRatio)| is added, since the correct displaySize would be calculated by this function, so that we could pass the Reftest(layout/reftests/ogg-video/) @@ +439,5 @@ > + if (!mTheoraState) { > + TheoraState* theoraState = static_cast<TheoraState*>(s); > + SetupTargetTheora(theoraState); > + } else { > + s->Deactivate(); Not removing the call |Reset()| out of |Deactivate()| to make mochitest (content/media/test/test_buffered.html) pass. Why ? Currently, we obtain Ogg file duration from active bitstream and check if the duration is the same as its buffered RangeEndTime [1]. So if we keep packet data for those inactive bitstreams. Mochitest would fail at [2]. (Non-primary bitstream end time is found first and its value is 4.424, but duration of active primary is 4.208333). I tend to file a bug to track this scenario, the buffered(may contain active/inactive data) range ended time should not always be the same as media duration(active bitstream). Either modifying the test case, or always returning range end time for active bitstream. And |Reset()| shouldn't be called in |Deactivate()|, because |Reset()| cleans up packet data. It does things more than "deactivate". Hi Ralph, Could you give me some suggestions ? [1] http://dxr.mozilla.org/mozilla-central/source/content/media/ogg/OggReader.cpp#947 [2] http://dxr.mozilla.org/mozilla-central/source/content/media/test/test_buffered.html#33
(In reply to Kilik Kuo [:kikuo][:kilikkuo] from comment #49) > @@ +439,5 @@ > > + if (!mTheoraState) { > > + TheoraState* theoraState = static_cast<TheoraState*>(s); > > + SetupTargetTheora(theoraState); > > + } else { > > + s->Deactivate(); > > Not removing the call |Reset()| out of |Deactivate()| to make mochitest > (content/media/test/test_buffered.html) pass. > > Why ? > Currently, we obtain Ogg file duration from active bitstream and check if > the duration is the same as its buffered RangeEndTime [1]. > So if we keep packet data for those inactive bitstreams. Mochitest would > fail at [2]. > (Non-primary bitstream end time is found first and its value is 4.424, but > duration of active primary is 4.208333). > > I tend to file a bug to track this scenario, the buffered(may contain > active/inactive data) range ended time should not always be the same as > media duration(active bitstream). > Either modifying the test case, or always returning range end time for > active bitstream. > And |Reset()| shouldn't be called in |Deactivate()|, because |Reset()| > cleans up packet data. It does things more than "deactivate". > > Hi Ralph, > Could you give me some suggestions ? > > [1] > http://dxr.mozilla.org/mozilla-central/source/content/media/ogg/OggReader. > cpp#947 > [2] > http://dxr.mozilla.org/mozilla-central/source/content/media/test/ > test_buffered.html#33 I guess the necessity of |Reset()| in |Deactivate()| could be discussed once we're going to implement the support of selecting non-primary tracks, since the undecoded-queued packet data in non-primary bitstream CodecState are cleaned up when they've got deactivated. So that the RangeEndTime of non-primary bitstream won't be obtained.
Carry r+ from rillian, fixed conflicts due to rebase.
Attachment #8482094 - Attachment is obsolete: true
Attachment #8484734 - Flags: review+
(In reply to Kilik Kuo [:kikuo][:kilikkuo] from comment #52) > Created attachment 8484734 [details] [diff] [review] > Part1 - Parsing ogg fisbone and fill into audiotrack/videotrack_v5.diff > > Carry r+ from rillian, fixed conflicts due to rebase. https://tbpl.mozilla.org/?tree=Try&rev=210a9693f4f7 This try result is for Comment 52.
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: