Closed
Bug 778053
Opened 12 years ago
Closed 12 years ago
Support WAV tag metadata
Categories
(Core :: Audio/Video, defect)
Tracking
()
RESOLVED
FIXED
mozilla20
People
(Reporter: rillian, Assigned: czchen)
References
Details
(Whiteboard: [mentor=rillian] [lang=c++])
Attachments
(2 files, 13 obsolete files)
81.29 KB,
patch
|
kinetik
:
review+
|
Details | Diff | Splinter Review |
13.39 KB,
patch
|
kinetik
:
review+
|
Details | Diff | Splinter Review |
WAV audio files can have metadata giving title, artist, etc. This bug proposes to modify nsWaveReader so as to provide those tags to webcontent via ::ReadMetadata() and media.mozGetMetadata().
Reporter | ||
Updated•12 years ago
|
Whiteboard: [mentor=rillian] [lang=c++]
(In reply to Ralph Giles (:rillian) from comment #0) > WAV audio files can have metadata giving title, artist, etc. > > This bug proposes to modify nsWaveReader so as to provide those tags to > webcontent via ::ReadMetadata() and media.mozGetMetadata(). rillian ChangZhuo is
rillian, ChangZhuo is interested in this bug. And I think he can help.
Reporter | ||
Comment 3•12 years ago
|
||
Thanks, Thomasy, for the introduction. Hi ChangZhuo, thanks for taking an interest in this bug! Do you know how you want to proceed? In general idea is to replace the *aTags = nullptr assignment in nsWaveReader::ReadMetadata() with the output of a new function. That function, called something like nsWaveReader::GetTags(), creates and returns a pointer to a new nsHTMLMediaElement::MetadataTags hash table. The example in nsOggReader.cpp may be a helpful guide. I know less about the WAVE metadata chunk, and what we should support there, but I now the free Audacity programme can create .wav files with embedded metadata. That's a good place to start. Please let me know right away if you have any questions. I'm happy to help.
Assignee | ||
Comment 4•12 years ago
|
||
Hi Ralph, For wave format, I found a link[1] that describe how wave store the metadate. The format is simple tag-length-value structure with ASCII tag name, and we only need to extract the INFO list chunk in LIST chunk to get the metadata. However, there are some problems I need advice: 1. There is no chunk order In current design, it assumes fmt chunk always appears before data chunk. It is reasonable because we need fmt chunk to decode data chunk. However, the order of chunk is not specified in spec, so we cannot use nsWaveReader::ScanForwardUntil() to find the LIST chunk. My plan for this is adding a new function nsWaveReader::GetNextChunk() to get the next chunk and parse it if necessary. However, in this case, we need to use mDecoder->GetResource()->Seek() to adjust resource pointer, and I am not sure if it causes other side effect. 2. ReentrantMonitorAutoEnter ReentrantMonitorAutoEnter looks like a RAII class, but I do not know the purpose if this. Please tell me if anyone know about this class, thanks. ref: [1] http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html Regards, ChangZhuo
Comment 5•12 years ago
|
||
ReentrantMonitorAutoEnter is indeed an RAII class that enter and exits the monitor that is passed during its construction. A monitor is a synchronization object, I'm not sure how familiar you are with this concept, see [1] ([4] if you want to read the relevant code, click on the links to navigate). In your context (the nsWaveReader), you enter the decoder monitor (see [2]) when you want to call certain methods of the state machine, that change its state. For example in [2], the call informs the state machine of the new duration you found while decoding the media. Because the duration can be accessed by the state machine thread and the decoder thread, we need some kind of synchronization. At that point, you should probably read [5] to have an overview of how the media code is architectured in Gecko so the last paragraph makes sense. Anyway. Usually, if you wonder whether you should call some method while being in a monitor, check the definition of the method you want to call, and if it contains |mDecoder->GetReentrantMonitor().AssertCurrentThreadIn();|, that means this will Assert that the Current Thread is In the monitor, and blow up (in a debug build), if you forgot to use a monitor. [1]: http://en.wikipedia.org/wiki/Monitor_%28synchronization%29 [2]: http://mxr.mozilla.org/mozilla-central/source/content/media/wave/nsWaveReader.cpp#141 [3]: http://mxr.mozilla.org/mozilla-central/source/content/media/nsBuiltinDecoderStateMachine.cpp#1349 [4]: http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/ReentrantMonitor.h#168 [5]: http://blog.pearce.org.nz/2011/02/firefox-4-video-decoder-architecture.html
Reporter | ||
Comment 6•12 years ago
|
||
(In reply to ChangZhuo Chen from comment #4) > My plan for this is adding a new function nsWaveReader::GetNextChunk() to > get the next chunk and parse it if necessary. However, in this case, we need > to use mDecoder->GetResource()->Seek() to adjust resource pointer, and I am > not sure if it causes other side effect. Calling ::GetNextChunk and dispatching to the various parsers sounds reasonable, but we want to be careful not to scan the whole file looking for metadata. It should probably start decoding as soon as it's seen both the fmt and data chunk headers. Are there files with the LIST::INFO chunk at the end? We could seek past the data chunk (and any other large chunks) to find that, but over the network this always adds delay.
Assignee | ||
Comment 7•12 years ago
|
||
(In reply to Paul ADENOT (:padenot) from comment #5) Hi Paul, Thank for your explanation. So the reason why ReentrantMonitorAutoEnter is needed in [1] and [2] is because nsWaveReader tries to update the data like sample rate, channel, frame size, ... that will affect the decoder. If I do not update anything related to decode, ReentrantMonitorAutoEnter is not needed, right? [1] http://mxr.mozilla.org/mozilla-central/source/content/media/wave/nsWaveReader.cpp#449 [2] http://mxr.mozilla.org/mozilla-central/source/content/media/wave/nsWaveReader.cpp#481
Assignee | ||
Comment 8•12 years ago
|
||
(In reply to Ralph Giles (:rillian) from comment #6) According to [1], "There are no restrictions upon the order of the chunks within a WAVE file, with the exception that the Format chunk must precede the Data chunk.", so we cannot know where the LIST::INFO chunk is until finding it. However, because putting LIST::INFO chunk after data chunk is not streaming friendly, I think we can just stop parsing after data chunk is found. If there is no LIST::INFO before data chunk, there is possible no LIST::INFO chunk at all. [1] http://www.blitter.com/~russtopia/MIDI/~jglatt/tech/wave.htm
Reporter | ||
Comment 9•12 years ago
|
||
(In reply to ChangZhuo Chen from comment #8) > I think we can just stop parsing after data chunk is found. I think that's the best plan for now. Once that's working we can look at whether seeking past the data chunk makes sense. -r
Assignee | ||
Comment 10•12 years ago
|
||
The [1] is full list of metadata stored in LIST::INFO. Currently I just implement the following tags: IART as artist ICMT as comments IGNR as genre INAM as name IPRD as product Do we need to implement other tags? or these tags are enough? For the `name' and `product', the names used in ogv and ogg are `title' and `album'. Do we need to synchronize tag names between wav and ogg? [1] http://www.warpspeed.com.au/cgi-bin/inf2html.cmd?..%5Chtml%5Cbook%5CToolkt40%5CMMREF3.INF+2218 [2] http://dxr.mozilla.org/mozilla-central/content/media/test/manifest.js.html#l323
Reporter | ||
Comment 11•12 years ago
|
||
(In reply to ChangZhuo Chen from comment #10) > The [1] is full list of metadata stored in LIST::INFO. Currently I just > implement the following tags: > > IART as artist > ICMT as comments > IGNR as genre > INAM as name > IPRD as product > > Do we need to implement other tags? or these tags are enough? I'd like ICRD as date as well. > For the `name' and `product', the names used in ogv and ogg are `title' and > `album'. Do we need to synchronize tag names between wav and ogg? Ah! I wondered what people used for the album tag. Just use the native tag names. The plan is to provide another call that maps tags from every format to a shared vocabulary, in bug 793737. So for the mozGetMetadata interface it's best to just return them in whatever the native tag set is. Either the fourcc, or your mapping to english above. Does that sound good to you?
Reporter | ||
Comment 12•12 years ago
|
||
BTW, if you go with the fourcc codes, you could just return everything in the LIST::INFO without having to guess which would be important. That might be better.
Assignee | ||
Comment 13•12 years ago
|
||
Assignee | ||
Comment 14•12 years ago
|
||
Assignee | ||
Comment 15•12 years ago
|
||
Can someone help me to submit the test on try server? My account is not ready yet.
Comment 16•12 years ago
|
||
(In reply to ChangZhuo Chen from comment #15) > Can someone help me to submit the test on try server? My account is not > ready yet. Please check https://tbpl.mozilla.org/?tree=Try&rev=588c42a8ef5d If you have any question do not hesitate to ping me on IRC.
Comment 17•12 years ago
|
||
Try run for 588c42a8ef5d is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=588c42a8ef5d Results (out of 14 total builds): failure: 14 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/thomas@thomasy.tw-588c42a8ef5d
Reporter | ||
Comment 18•12 years ago
|
||
You need to 'git add' or 'hg add' the new test files, I guess. BTW, it might be useful to have test files with zero and one comment, and an invalid comment the reader should not pass on, as well.
Assignee | ||
Comment 19•12 years ago
|
||
Attachment #664495 -
Attachment is obsolete: true
Assignee | ||
Comment 20•12 years ago
|
||
Attachment #664887 -
Attachment is obsolete: true
Assignee | ||
Comment 21•12 years ago
|
||
I updated the test cases. Now the following scenarios are tested: - No INFO tag tag (wavedata_u8.wav) - The INFO tag contains ASCII string (wave_metadata.wav) - The INFO tag contains UTF-8 string (wave_metadata_utf-8.wav) - The INFO tag contains bad length (wave_metadata_bad_len.wav) - The INFO tag string does not null-terminated (wave_metadata_bad_no_null.wav) - The INFO tag contains invalid UTF-8 character (wave_metadata_bad_utf8.wav) - The INFO tag contains unknown tag (wave_metadata_unknown_tag.wav)
Assignee | ||
Comment 22•12 years ago
|
||
Attachment #664494 -
Attachment is obsolete: true
Attachment #664890 -
Attachment is obsolete: true
Comment 23•12 years ago
|
||
Try run for 51de811b210e is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=51de811b210e Results (out of 14 total builds): failure: 14 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/thomas@thomasy.tw-51de811b210e
Comment 24•12 years ago
|
||
Try run for 744dad3c2468 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=744dad3c2468 Results (out of 16 total builds): exception: 16 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/czchen@gmail.com-744dad3c2468
Comment 25•12 years ago
|
||
Try run for b3fb0cbe6ff1 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=b3fb0cbe6ff1 Results (out of 321 total builds): exception: 1 success: 299 warnings: 21 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/thomas@thomasy.tw-b3fb0cbe6ff1
Reporter | ||
Comment 26•12 years ago
|
||
Don't forget to update the attached bug too. :)
Reporter | ||
Comment 27•12 years ago
|
||
Attached patch, rather. Too early in the morning.
Assignee | ||
Comment 28•12 years ago
|
||
Attachment #664902 -
Attachment is obsolete: true
Assignee | ||
Comment 29•12 years ago
|
||
Assignee | ||
Comment 30•12 years ago
|
||
I updated the attached patches. Please help to check if there is any problem, thanks.
Comment 31•12 years ago
|
||
Try run for 1c3d8fdcbdc8 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=1c3d8fdcbdc8 Results (out of 290 total builds): success: 274 warnings: 16 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/czchen@gmail.com-1c3d8fdcbdc8
Comment 32•12 years ago
|
||
(In reply to ChangZhuo Chen from comment #30) > I updated the attached patches. Please help to check if there is any > problem, thanks. As a starter, if you want to get a formal review. You should set the review flag to r? and the reviewer to kinetik(kinetik@flim.org).
Assignee | ||
Updated•12 years ago
|
Attachment #666320 -
Flags: review?(kinetik)
Assignee | ||
Updated•12 years ago
|
Attachment #666321 -
Flags: review?(kinetik)
Assignee | ||
Comment 33•12 years ago
|
||
(In reply to Thomasy from comment #32) > As a starter, if you want to get a formal review. You should set the review > flag to r? and the reviewer to kinetik(kinetik@flim.org). I updated the review flag. Thanks for the help.
Assignee | ||
Comment 34•12 years ago
|
||
Attachment #666320 -
Attachment is obsolete: true
Attachment #666320 -
Flags: review?(kinetik)
Attachment #668758 -
Flags: review+
Assignee | ||
Comment 35•12 years ago
|
||
Attachment #668759 -
Flags: review?(kinetik)
Assignee | ||
Comment 36•12 years ago
|
||
Attachment #666321 -
Attachment is obsolete: true
Attachment #668758 -
Attachment is obsolete: true
Attachment #666321 -
Flags: review?(kinetik)
Attachment #668760 -
Flags: review?(kinetik)
Assignee | ||
Comment 37•12 years ago
|
||
Attachment #668759 -
Attachment is obsolete: true
Attachment #668759 -
Flags: review?(kinetik)
Attachment #668761 -
Flags: review?(kinetik)
Assignee | ||
Comment 38•12 years ago
|
||
Attachment #668760 -
Attachment is obsolete: true
Attachment #668760 -
Flags: review?(kinetik)
Attachment #668762 -
Flags: review?(kinetik)
Updated•12 years ago
|
Attachment #668762 -
Flags: review?(kinetik) → review+
Comment 39•12 years ago
|
||
Comment on attachment 668761 [details] [diff] [review] [patch] read wave metadata Review of attachment 668761 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the patch! I apologize for taking so long to review it. I meant to get to it last week, but it slipped off of my radar. ::: content/media/wave/nsWaveReader.cpp @@ +134,5 @@ > + bool findDataOffset = false; > + nsHTMLMediaElement::MetadataTags *tag = nullptr; > + > + // The spec does not mention abort the order of of fmt, data, LIST chunks, so > + // we try to find them one by one. IBM/MS RIFF/MCI Specification 1.0 states: The WAVE form is defined as follows. Programs must expect (and ignore) any unknown chunks encountered, as with all RIFF forms. However, <fmt-ck> must always occur before <wave-data>, and both of these chunks are mandatory in a WAVE file. And the INFO chunk is under the LIST chunk at a higher level in the tree, so the only valid orders are: RIFF WAVE FMT DATA LIST INFO or RIFF LIST INFO WAVE FMT DATA So please restrict the code to accept only those forms, as the previous code did. @@ +161,5 @@ > + break; > + > + case LIST_CHUNK_MAGIC: > + // LIST chunk is optional, so it is okay if we cannot read it. > + LoadListChunk(chunkSize, &tag); This will leak tag when the FRMT or DATA branch return an error. @@ +173,5 @@ > + > + // RIFF chunks are two-byte aligned, so round up if necessary. > + chunkSize += chunkSize % 2; > + > + mDecoder->GetResource()->Seek(nsISeekableStream::NS_SEEK_SET, offset + CHUNK_HEADER_SIZE + chunkSize); You need to test for an error and deal with it. It's not always possible to seek in streams, so that case needs to be handled. The old code had the nice attribute that it parsed the stream linearly, which these changes breaks by expecting to seek between chunks. I'm not *that* happy about that, because it could make loading some files significantly slower... and as I mentioned above, we still need to handle the case where the stream isn't seekable. Can we delay loading this metadata until later in some cases? Or perhaps for a first-pass at this, we could only load the metadata here it it appears before the WAVE chunk... but that's going to exclude an awful lot of files, I suspect. @@ +178,5 @@ > + } > + > + // Ensure mandatory information are read. > + if (!(loadFormatChunk && findDataOffset)) { > + return NS_ERROR_FAILURE; This will leak tag too. @@ +181,5 @@ > + if (!(loadFormatChunk && findDataOffset)) { > + return NS_ERROR_FAILURE; > + } > + > + mDecoder->GetResource()->Seek(nsISeekableStream::NS_SEEK_SET, mWavePCMOffset); Test for error. @@ +556,5 @@ > + // List chunks are always word (two byte) aligned. > + NS_ABORT_IF_FALSE(mDecoder->GetResource()->Tell() % 2 == 0, > + "LoadListChunk called with unaligned resource"); > + > + nsAutoArrayPtr<char> chunk(new char[aChunkSize + 1]); aChunkSize comes directly from the file, so this could be a huge value, including UINT32_MAX, which will overflow. @@ +570,5 @@ > + > + // If there are multiple LIST::INFO in wave, we only parse the first one, the > + // others are ignored. > + if (*aTags) { > + return false; I'd rather this check was removed and the caller avoided calling this function if the tags have already been read. @@ +600,5 @@ > + { 0x49534850, NS_LITERAL_CSTRING("sharpness") }, // ISHP > + { 0x49535243, NS_LITERAL_CSTRING("source") }, // ISRC > + { 0x49535246, NS_LITERAL_CSTRING("source_form") }, // ISRF > + { 0x49544348, NS_LITERAL_CSTRING("technician") }, // ITCH > + }; Do we want to read and expose all of these? Some of them seem nonsensical for audio, e.g. ICRP, IDIM, IDPI, ILGT, IPLT. @@ +605,5 @@ > + > + // Put a null at the end of chunk to ensure the nsCString does not read > + // beyond chunk. > + chunk[aChunkSize] = 0; > + const char * const end = chunk.get() + aChunkSize + 1; What if aChunkSize is large enough that the pointer wraps? Any time you're reading data from a file, you need to validate it *very* carefully. @@ +617,5 @@ > + // subchunk shall not larger than parent chunk. > + if (length > aChunkSize) { > + break; > + } > + nsCString val(p); Use the nsCString constructor that takes a pointer and a length, and pass in the length from the chunk size. @@ +621,5 @@ > + nsCString val(p); > + // Chunks in List::INFO are always word (two byte) aligned. So round up if > + // necessary. > + p += length; > + p += length % 2; Make the second line: p += p % 2; So that it matches the other rounding up code. @@ +627,5 @@ > + if (!IsUTF8(val)) { > + continue; > + } > + > + for (size_t i = 0; i < sizeof(ID_TO_NAME) / sizeof(ID_TO_NAME[0]); ++i) { Use mozilla::ArrayLength.
Attachment #668761 -
Flags: review?(kinetik) → review-
Assignee | ||
Comment 40•12 years ago
|
||
Hi, Thank for the review. I will fix the problems and update the patch. However, I still have some questions about the code, please help to check them, thanks. (In reply to Matthew Gregan [:kinetik] from comment #39) > > IBM/MS RIFF/MCI Specification 1.0 states: > > RIFF > WAVE > FMT > DATA > LIST > INFO > > or > > RIFF > LIST > INFO > WAVE > FMT > DATA For the wave format, could you double confirm it? It looks like the INFO list chunk is in the same level as data and fmt chunk. The following example is from IBM/MS RIFF/MCI Specification 1.0, around p71: Example of a PCM WAVE file with 44.1 kHz sampling rate, mono, 20 bits per sample: RIFF( 'WAVE' INFO(INAM("O Canada"Z)) fmt(1, 1, 44100, 132300, 3, 20) data( <wave-data> ) ) > @@ +161,5 @@ > > + break; > > + > > + case LIST_CHUNK_MAGIC: > > + // LIST chunk is optional, so it is okay if we cannot read it. > > + LoadListChunk(chunkSize, &tag); > > This will leak tag when the FRMT or DATA branch return an error. For the leak, is there any thing like nsAutoArrayPtr() for the raw pointer? or I need to check every possible branch? > > @@ +173,5 @@ > > + > > + // RIFF chunks are two-byte aligned, so round up if necessary. > > + chunkSize += chunkSize % 2; > > + > > + mDecoder->GetResource()->Seek(nsISeekableStream::NS_SEEK_SET, offset + CHUNK_HEADER_SIZE + chunkSize); > > You need to test for an error and deal with it. It's not always possible to > seek in streams, so that case needs to be handled. > > The old code had the nice attribute that it parsed the stream linearly, > which these changes breaks by expecting to seek between chunks. I'm not > *that* happy about that, because it could make loading some files > significantly slower... and as I mentioned above, we still need to handle > the case where the stream isn't seekable. I will following the original design to parse the stream linearly. > > Can we delay loading this metadata until later in some cases? Or perhaps > for a first-pass at this, we could only load the metadata here it it appears > before the WAVE chunk... but that's going to exclude an awful lot of files, > I suspect. I think we can assume that INFO list chunk always appers before data chunk because it is not stream friendly to put INFO list chunk after data chunk. I will try to download some wave files from Internet to see if there are lots of wave having INFO list chunk after data chunk. > > @@ +556,5 @@ > > + // List chunks are always word (two byte) aligned. > > + NS_ABORT_IF_FALSE(mDecoder->GetResource()->Tell() % 2 == 0, > > + "LoadListChunk called with unaligned resource"); > > + > > + nsAutoArrayPtr<char> chunk(new char[aChunkSize + 1]); > > aChunkSize comes directly from the file, so this could be a huge value, > including UINT32_MAX, which will overflow. For this part, I will set an upper bound for the chunk size and field size in case that we do not use too many memory. > > @@ +570,5 @@ > > + > > + // If there are multiple LIST::INFO in wave, we only parse the first one, the > > + // others are ignored. > > + if (*aTags) { > > + return false; > > I'd rather this check was removed and the caller avoided calling this > function if the tags have already been read. Ok. I will fix it. > > @@ +600,5 @@ > > + { 0x49534850, NS_LITERAL_CSTRING("sharpness") }, // ISHP > > + { 0x49535243, NS_LITERAL_CSTRING("source") }, // ISRC > > + { 0x49535246, NS_LITERAL_CSTRING("source_form") }, // ISRF > > + { 0x49544348, NS_LITERAL_CSTRING("technician") }, // ITCH > > + }; > > Do we want to read and expose all of these? Some of them seem nonsensical > for audio, e.g. ICRP, IDIM, IDPI, ILGT, IPLT. I just copy every tag defined in specification. For this part, maybe I just use four characters as tag. > > @@ +605,5 @@ > > + > > + // Put a null at the end of chunk to ensure the nsCString does not read > > + // beyond chunk. > > + chunk[aChunkSize] = 0; > > + const char * const end = chunk.get() + aChunkSize + 1; > > What if aChunkSize is large enough that the pointer wraps? > > Any time you're reading data from a file, you need to validate it *very* > carefully. I will set a upper bound for the aChunkSize. > > @@ +617,5 @@ > > + // subchunk shall not larger than parent chunk. > > + if (length > aChunkSize) { > > + break; > > + } > > + nsCString val(p); > > Use the nsCString constructor that takes a pointer and a length, and pass in > the length from the chunk size. You mean using nsCString val(p, length)? > > @@ +621,5 @@ > > + nsCString val(p); > > + // Chunks in List::INFO are always word (two byte) aligned. So round up if > > + // necessary. > > + p += length; > > + p += length % 2; > > Make the second line: > > p += p % 2; > > So that it matches the other rounding up code. Ok. I will fix it. > > @@ +627,5 @@ > > + if (!IsUTF8(val)) { > > + continue; > > + } > > + > > + for (size_t i = 0; i < sizeof(ID_TO_NAME) / sizeof(ID_TO_NAME[0]); ++i) { > > Use mozilla::ArrayLength. Ok. I will fix it. Regards, ChangZhuo
Comment 41•12 years ago
|
||
(In reply to ChangZhuo Chen from comment #40) > For the wave format, could you double confirm it? It looks like the INFO list > chunk is in the same level as data and fmt chunk. The following example is > from IBM/MS RIFF/MCI Specification 1.0, around p71: I think that example in the spec is wrong (or it really means a LIST INFO chunk and it's written in a confusing way), so let's ignore it for now. Of course, since it's in the spec, it may mean that files exist with that layout. If so, we'll deal with it when we come to it. Looking at gstreamer's RIFF and WAVE parsers (gst-plugins-base/gst-libs/gst/riff and gst-plugins-good/gst/wavparse), it only expects INFO chunks inside LIST chunks, so I think we're safe to stick to what I said. Also in the gst code, it also looks like the tag names may appear in some files in lower case (see gst_riff_parse_info where it upcases the tag ID read out of the RIFF). I don't know how common that is, but we might want to handle that. > For the leak, is there any thing like nsAutoArrayPtr() for the raw pointer? > or > I need to check every possible branch? Yeah, you can use nsAutoRef. Sorry, I had forgotten about that, otherwise I would've pointed you to it when I first commented. > I think we can assume that INFO list chunk always appers before data chunk > because it is not stream friendly to put INFO list chunk after data chunk. I > will try to download some wave files from Internet to see if there are lots > of > wave having INFO list chunk after data chunk. Wikipedia made it sound fairly common for them to occur at the end of the file: http://en.wikipedia.org/wiki/Resource_Interchange_File_Format#INFO_chunk_placement_problems ...and it makes sense, given that metadata is often added later and it's painful to rewrite entire files to add a small bit of data. You could test whether the stream is seekable, and if not, only read the INFO if it's at the start of the file. That's similar to what the gstreamer code mentioned above does. > I just copy every tag defined in specification. For this part, maybe I just > use four characters as tag. Ralph might have an opinion. My preference is to only include things that are obviously useful for audio, at least for now. We can always add more later, but it's difficult to remove things once they're exposed to web content. > You mean using nsCString val(p, length)? That's it. Thanks!
Comment 42•12 years ago
|
||
(In reply to Matthew Gregan [:kinetik] from comment #41) > Yeah, you can use nsAutoRef. Sorry, I had forgotten about that, otherwise I > would've pointed you to it when I first commented. Actually, plain old nsAutoPtr will also work and it's slightly simpler to use.
Assignee | ||
Comment 43•12 years ago
|
||
(In reply to Matthew Gregan [:kinetik] from comment #41) > Looking at gstreamer's RIFF and WAVE parsers > (gst-plugins-base/gst-libs/gst/riff and gst-plugins-good/gst/wavparse), it > only expects INFO chunks inside LIST chunks, so I think we're safe to stick > to what I said. In gst_wavparse_stream_headers in gstreamer, the GST_RIFF_TAG_data and GST_RIFF_TAG_LIST are in the same switch-case statement, so FMT, LIST, and DATA are all in the same level, like the following right? FMT LIST INFO DATA > Also in the gst code, it also looks like the tag names may appear in some > files in lower case (see gst_riff_parse_info where it upcases the tag ID > read out of the RIFF). I don't know how common that is, but we might want > to handle that. For this case, I think I can use the same way to handle it as gstreamer does. > You could test whether the stream is seekable, and if not, only read the > INFO if it's at the start of the file. That's similar to what the gstreamer > code mentioned above does. For the seekable test, is calling Seek and check for return is enough? > > You mean using nsCString val(p, length)? The length here contains the NULL in the string, shall I strip the NULL by using SetLength? or is there any other API I can use? Regards, ChangZhuo
Assignee | ||
Comment 44•12 years ago
|
||
The patch fix the following items: - Only allow the following wave layout: FMT LIST INFO DATA LIST INFO FMT DATA - Avoid using Seek() for nonseekable resource - Use nsAutoPtr to prevent tag leak - Limit the LIST::INFO to 64k - Read only artist, comments, genre, name in LIST::INFO - Support lower case ID in LIST::INFO - Use mozilla::ArrayLength for array length Please help to review it, thanks.
Attachment #668761 -
Attachment is obsolete: true
Attachment #678103 -
Flags: review?(kinetik)
Comment 45•12 years ago
|
||
Comment on attachment 678103 [details] [diff] [review] [patch] read wave metadata Review of attachment 678103 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for the delay in reviewing this. :-( Just a few minor comments, r+ with those fixed. There have been a number of refactoring changes landed in the media code recently, so you will need to rebase your patch before landing it. The changes should all be straightforward renames, e.g. nsWaveDecoder has been renamed to WaveDecoder. ::: content/media/wave/nsWaveReader.cpp @@ +135,5 @@ > if (!loaded) { > return NS_ERROR_FAILURE; > } > > + nsAutoPtr<nsHTMLMediaElement::MetadataTags> tag; Call this "tags". @@ +541,5 @@ > + { 0x49474e52, NS_LITERAL_CSTRING("genre") }, // IGNR > + { 0x494e414d, NS_LITERAL_CSTRING("name") }, // INAM > + }; > + > + const char * const end = chunk.get() + aChunkSize; No space between char and *, please. @@ +548,5 @@ > + aTags->Init(); > + > + while (p + 8 < end) { > + uint32_t id = ReadUint32BE(&p); > + /* make uppercase. from gstreamer */ Use // for comments, and make the comment slightly more explicit, something like: // Uppercase tag id, inspired by GStreamer's Wave parser. @@ +553,5 @@ > + id &= 0xDFDFDFDF; > + > + uint32_t length = ReadUint32LE(&p); > + > + // subchunk shall not exceed parent chunk. Use a capital at start of comment. @@ +560,5 @@ > + } > + > + nsCString val(p, length); > + if (val[length - 1] == '\0') > + val.SetLength(length - 1); Use braces here please. @@ +635,5 @@ > + > + // Move forward to next chunk > + int64_t forward = chunkStart + chunkSize - GetPosition(); > + > + if (forward < 0) { chunkSize is unsigned and the results of GetPosition should be positive, so this can't be negative. Signed overflow is undefined, so using a less-than-zero test is unsafe. Please use CheckedInt64 instead.
Attachment #678103 -
Flags: review?(kinetik) → review+
Assignee | ||
Comment 46•12 years ago
|
||
Refactoring as requested. Please help to see if there is any problem, thanks. https://tbpl.mozilla.org/?tree=Try&rev=cc2cb12bed9d
Attachment #678103 -
Attachment is obsolete: true
Attachment #684955 -
Flags: review+
Assignee | ||
Updated•12 years ago
|
Attachment #684955 -
Flags: review+ → review?(kinetik)
Assignee | ||
Comment 47•12 years ago
|
||
Attachment #684955 -
Attachment is obsolete: true
Attachment #684955 -
Flags: review?(kinetik)
Attachment #684957 -
Flags: review?(kinetik)
Attachment #684955 -
Attachment is patch: true
Assignee | ||
Updated•12 years ago
|
Attachment #684957 -
Attachment is patch: true
Updated•12 years ago
|
Attachment #684957 -
Flags: review?(kinetik) → review+
Comment 48•12 years ago
|
||
Thank you!
Keywords: checkin-needed
Comment 49•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/87223b686e85 https://hg.mozilla.org/integration/mozilla-inbound/rev/706259475ba6
Flags: in-testsuite+
Keywords: checkin-needed
(In reply to Ralph Giles (:rillian) from comment #12) > BTW, if you go with the fourcc codes, you could just return everything in > the LIST::INFO without having to guess which would be important. That might > be better. I agree, I think this approach would be better. Simpler code for us, and gives the Web developer access even to metadata we don't recognize.
Comment 51•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/87223b686e85 https://hg.mozilla.org/mozilla-central/rev/706259475ba6
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
You need to log in
before you can comment on or make changes to this bug.
Description
•