942078 - Video thumbnail generation rule

Reporter

Description

•

11 years ago

Currently, a video's thumbnail is generated by seeking to 1/4 of the video's duration and decoding the I-frame there. That frame may be a poor choice, since it could be a black screen or some other frame that is meaningless to users. An alternative is to decode the largest I-frame, which is more likely to contain rich information or to be of high quality.

Marco Chen [:mchen]

Comment 1

•

11 years ago

Hi Chris & David, from the perspective of the Gecko media and Gaia video app domains, may we have your opinion on this improvement to video thumbnail generation? Thanks.

Flags: needinfo?(dflanagan)

Flags: needinfo?(chris.double)

cajbir (:cajbir)

Comment 2

•

11 years ago

(In reply to Marco Chen [:mchen] from comment #1)
> According to Gecko media & Gaia video app domain, may we know your opinion
> on this improvement of generating video thumbnail?

I think this is a good approach. A while back we looked at doing a thumbnail-specific API in the media code, and that was one of the approaches suggested (bug 873959).

Flags: needinfo?(chris.double)

David Flanagan [:djf]

Comment 3

•

11 years ago

The current approach comes from very early UX specs for the video app. The idea is that many professional videos begin with a blank frame, so it doesn't make sense to start at the first frame. IIRC, we don't do 1/4 of the duration. We do 10 seconds or 1/4 of the duration, whichever is smaller. But there is no real logic to it; we just hope that it is a better frame than the first one. If you want to do something better, that would be great. Just picking the largest frame seems like it might get you an image that is full of text or something else that is hard to compress. It seems like it would be better if you could do something like start at the beginning and look for a jump in frame size to find the first non-trivial frame. But I'm completely out of my expertise here. If you can do something smart here, we'll certainly use it for the video app. Needinfo John to bring this to his attention.
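The two ideas in this comment (the "10 seconds or 1/4, whichever is smaller" rule and the "look for a jump in frame size" heuristic) can be sketched roughly as below. This is only an illustration in Python, not the video app's code: the function names, the `jump_ratio` threshold, and the idea of feeding in a list of key-frame sizes are all assumptions made for the sketch.

```python
def thumbnail_seek_time(duration_s: float) -> float:
    """Seek target in seconds: 10 seconds or 1/4 of the duration,
    whichever is smaller."""
    return min(10.0, duration_s / 4.0)


def first_nontrivial_frame(key_frame_sizes, jump_ratio=3.0):
    """Return the index of the first key frame whose encoded size is at
    least jump_ratio times the size of the key frame before it.

    key_frame_sizes: encoded sizes in bytes, in presentation order
                     (hypothetical input; something has to extract these).
    jump_ratio:      hypothetical threshold; a real value would need tuning.
    Falls back to index 0 when no such jump is found.
    """
    for i in range(1, len(key_frame_sizes)):
        prev = max(key_frame_sizes[i - 1], 1)  # avoid a zero baseline
        if key_frame_sizes[i] >= jump_ratio * prev:
            return i
    return 0
```

A size jump only hints that the picture stopped being blank; it does not rule out the hard-to-compress text frame mentioned above.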

Flags: needinfo?(dflanagan) → needinfo?(johu)

John Hu [:johnhu][:johu][:醬糊小弟]

Comment 4

•

11 years ago

Thanks for bringing this to me, David. I don't think the frame at 1/4 is totally useless. We use a canvas to draw the thumbnail and assume that the video element decodes to the frame at the specified time. That may or may not be an I-frame, but it is at least a frame constructed from its I-frame by applying some P/B-frames to get the correct image. From my knowledge of H.264, an I-frame contains the full image, and all frames after that I-frame are constructed from it. The result shouldn't be a black or partially black image if we construct the frame from the I-frame and apply all the frame changes between that I-frame and the specified time. Another issue is that every video may have a different I-frame interval. Some use a constant interval, but others do not, and the creator of a video can change the strategy for placing I-frames. I feel that if we can find a frame whose color histogram is bell-shaped, it may be a good frame for the video thumbnail.
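One possible reading of the "bell-shaped histogram" idea is sketched below. It assumes the decoded frame is already available as a flat list of 8-bit luma values, and it approximates "bell-shaped" as "most pixels fall in the mid-tones"; that interpretation, the function names, and the bin count are assumptions of this sketch, not something the comment specifies.

```python
def luma_histogram(luma_values, bins=16):
    """Count 8-bit luma values (integers 0..255) into equal-width bins."""
    hist = [0] * bins
    for v in luma_values:
        hist[min(v * bins // 256, bins - 1)] += 1
    return hist


def midtone_score(hist):
    """Fraction of pixels in the middle half of the histogram.

    A frame that is almost entirely black or white scores near 0.0;
    a frame dominated by mid-tones scores near 1.0.
    """
    total = sum(hist)
    if total == 0:
        return 0.0
    n = len(hist)
    lo, hi = n // 4, n - n // 4  # middle half of the bins
    return sum(hist[lo:hi]) / total
```

A caller could decode a handful of candidate frames and keep the one with the highest score, at the cost of decoding and scanning each candidate.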

Flags: needinfo?(johu)

Blake Wu [:bwu][:blakewu]

Reporter

Comment 5

•

11 years ago

Usually, seeking during video playback goes to the nearest I-frame around the seek time. The reason is that the frame at the exact seek time may not be an I-frame. If it is not, the media extractor still needs to find the I-frame before it and decode that I-frame plus the following P/B-frames until it reaches the seek time. This may take some time, and the user experience may suffer. The reason to pick the largest I-frame is that if the raw data of a frame is black or plain, its encoded size will be small. From my understanding of H.264, the more information a frame contains, the bigger it is. So this is a pretty simple way to filter out some "bad" frames. Checking the color histogram would be an even better way. However, it may take some time to decode and analyze each I-frame, and I'm not sure how to do that quickly.

cajbir (:cajbir)

Comment 6

•

11 years ago

(In reply to Blake Wu from comment #5)
> If it is not
> I-frame, media extractor still needs to find the I-frame before it and
> decode I-frame with the following P/B frames until reaching the seeking
> time. This may take some time and user experience may not be good.

This is what the HTML video element does. It seeks to the nearest key frame then decodes up to the requested seek frame. Implementing the faster, but less accurate, method of seeking to key frame only is bug 778077.
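The cost difference between the two seek styles can be illustrated with a simplified sketch. It assumes frames are addressed by index in decode order and ignores B-frame reordering; `accurate_seek_plan` and its inputs are hypothetical names for this illustration, not Gecko APIs.

```python
from bisect import bisect_right


def accurate_seek_plan(key_frame_indices, target_index):
    """Return (start_key_frame, frames_to_decode) for an accurate seek.

    key_frame_indices: sorted frame indices of the key frames.
    target_index:      frame index the caller wants to display.
    The decoder starts at the last key frame at or before the target and
    decodes every frame up to and including the target.
    """
    pos = bisect_right(key_frame_indices, target_index) - 1
    if pos < 0:
        raise ValueError("no key frame at or before the target")
    start = key_frame_indices[pos]
    return start, target_index - start + 1
```

For example, with key frames at 0, 30 and 60, an accurate seek to frame 45 starts at frame 30 and decodes 16 frames, whereas a key-frame-only seek would decode a single frame and stop there.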

John Hu [:johnhu][:johu][:醬糊小弟]

Comment 7

•

11 years ago

Wow! That's a pretty old bug. I never knew that we were going to implement fast seek. It may be very useful for dealing with this kind of problem. There is an extreme case for this kind of API: a video may have a key frame only once per minute or even less often, which leaves fast seek badly out of sync. I know that's the trade-off. I think Blake is trying to fix two things in one shot: (1) fast seek and (2) a meaningful thumbnail for the video. The first is exactly bug 778077. The second is the largest key frame. From the Gaia point of view, we don't have an API to do "fast seek". If we had one, I feel it would be good to use it to generate thumbnails. But I'm still wondering what difference it makes when we use a hardware codec. That should be very fast, and the user may not notice the difference. It does provide a large improvement when we fall back to a software codec.

(In reply to Chris Double (:doublec) from comment #6)
> This is what the HTML video element does. It seeks to the nearest key frame
> then decodes up to the requested seek frame. Implementing the faster, but
> less accurate, method of seeking to key frame only is bug 778077.

Sotaro Ikeda [:sotaro]

Comment 8

•

11 years ago

Bug 873959 seems like an easy way to implement this. In Android's stagefright, an MP4 file's thumbnail is chosen as the largest sample among the first 20 sync samples. See SampleTable::findThumbnailSample(): http://androidxref.com/4.4_r1/xref/frameworks/av/media/libstagefright/SampleTable.cpp#727
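The selection rule described here (largest sample among the first 20 sync samples) can be sketched as follows. This is a Python paraphrase of the idea, not a port of the stagefright code: the input shapes, the `max_scan` parameter name, and the empty-list fallback are assumptions of the sketch.

```python
def find_thumbnail_sample(sync_sample_indices, sample_sizes, max_scan=20):
    """Pick the largest sync sample among the first max_scan sync samples.

    sync_sample_indices: sample indices of the sync (key) samples, in order.
    sample_sizes:        mapping or list giving the encoded size in bytes
                         of each sample index.
    Returns the chosen sample index, or 0 if there are no sync samples
    listed (fallback chosen for this sketch).
    """
    best_index = 0
    best_size = -1
    for idx in sync_sample_indices[:max_scan]:
        size = sample_sizes[idx]
        if size > best_size:  # strict '>' keeps the earliest on ties
            best_index, best_size = idx, size
    return best_index
```

Capping the scan keeps the cost bounded for long files, since only the sample table is read and nothing is decoded until the chosen sample is known.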

Sotaro Ikeda [:sotaro]

Comment 9

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #8)
> bug 873959 seems easy way to implement this. In android stagefright, mp4
> file's thumbnail is determined biggest sample in 20 sync samples from front.
>
> SampleTable::findThumbnailSample()
> http://androidxref.com/4.4_r1/xref/frameworks/av/media/libstagefright/SampleTable.cpp#727

The thumbnail time can also be checked on the Gaia side.

John Hu [:johnhu][:johu][:醬糊小弟]

Comment 10

•

11 years ago

Hi Sotaro, we don't have libstagefright in Gecko; see bug 778052. So we can't get a thumbnail time from the metadata. We may only have it for the Ogg file format, since bug 763010 landed.