Closed Bug 1167875 Opened 10 years ago Closed 9 years ago

Integrate OpenFST sources into the build system under the media directory

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: kdavis, Assigned: kdavis)

References

Details

(Whiteboard: [webspeechapi][systemsfe])

Integrate OpenFST sources into the build system under the media directory. The OpenFST library (http://www.openfst.org/twiki/bin/view/FST/WebHome) will be used to learn a WFST for each language, which will then be used to map a non-dictionary grapheme sequence to a phoneme sequence.
Whiteboard: [webspeechapi]
Depends on: 1167938
QA Contact: kdavis
Assignee: nobody → anatal
How large is the library, source code and binary? And how will it be actually used?
Here's the download link to the library: http://www.openfst.org/twiki/bin/view/FST/FstDownload The size depends upon how much of the library Andre uses. As to how it will be used, the description gives an overview. Basically, it will be used to create phonetic representations of words that are not in the installed dictionary. For example, "Sandip" is not in the English dictionary, but may be spoken to the device. The device needs to know a phonetic representation of "Sandip" before it is able to recognize someone speaking the word "Sandip". The OpenFST library will be used to train a model that goes from the written representation of a word such as "Sandip" to the phonetic representation needed by the speech recognizer. Such functionality will be critical for the "call <name>" function and the "play the song <song name> by <band name>" function, as it is likely that many names are not in the dictionary.
with "used" I mean, how it will be called. What code and where will use it?
(In reply to Olli Pettay [:smaug] from comment #3)
> with "used" I mean, how it will be called. What code and where will use it?
The plan is that it will be called from within Pocketsphinx through modifications that Andre makes to Pocketsphinx. Does that answer your question?
Will we provide the patches to pocketsphinx so that we could just end up using plain normal pocketsphinx with support for openfst integration? That would keep the setup simpler since we wouldn't have to always apply our patches to pocketsphinx when updating the library.
(In reply to Olli Pettay [:smaug] from comment #6)
> Will we provide the patches to pocketsphinx so that we could just end up
> using plain normal
> pocketsphinx with support for openfst integration?
This is the idea. But I think having Pocketsphinx integrate the patch on the Whistler timescale is not going to happen. So, for the near term, we apply our patch to Pocketsphinx.
sure, totally fine.
(In reply to Olli Pettay [:smaug] (reviewing overload, no new review requests before June 1, pretty please) from comment #1)
> How large is the library, source code and binary?
> And how will it be actually used?
The size of the library, both source and binary, is almost the same as that of cmudict, which has already been reviewed and is intended to be replaced by g2p.
and what "the same" might be ;)
(In reply to Olli Pettay [:smaug] (reviewing overload, no new review requests before June 1, pretty please) from comment #10)
> and what "the same" might be ;)
openfst-1.4.1 == 3,878,873 bytes
cmudict-0.7b == 3,716,714 bytes
Assignee: anatal → kdavis
(In reply to Olli Pettay [:smaug] from comment #8)
> sure, totally fine.
Kelly will handle the OpenFST sources integration while I finish integrating the g2p algorithms into Pocketsphinx.
Whiteboard: [webspeechapi] → [webspeechapi][systemsfe]
Target Milestone: --- → 2.2 S14 (12june)
OpenFST uses the new C++11 class std::forward_list. However, STLport does not contain an implementation of std::forward_list. Thus, without either a patch to OpenFST or the integration of a new version of STLport that contains std::forward_list, integrating OpenFST seems impossible. Changing our implementation of STLport solely for OpenFST seems like a very bad idea, as STLport lies at the core of so much code. Hence, the only solution seems to be to patch OpenFST to remove its use of std::forward_list, and that is what I will do.
Depends on: 1051148
I just added the "dependency" to ensure that we first have a clean Web Speech implementation using only Pocketsphinx and then, after that, an implementation using Pocketsphinx and OpenFST.
froydnj has been looking at getting Android building with libc++, I don't know if that work is far enough along to help you.
I have patches that build locally and solve the STL problems, but I want to try them on all platforms. However, the fallout from the SCL3 issues (bug 1172750) was not letting me push to try the last time I tried. So, I'm just in a holding pattern right now. However, I think I have the solution.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> froydnj has been looking at getting Android building with libc++, I don't
> know if that work is far enough along to help you.
FYI: For std::forward_list the solution is easy, as std::forward_list is basically a singly linked list. So, all you do is make the substitution std::forward_list => std::list and make a few adjustments for the slight differences between the APIs of std::forward_list and std::list, and then you are done. The only drawback is that you use a bit more memory, as std::list is a doubly linked list. So, for each contained item you use one more pointer.
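To illustrate the kind of adjustment the substitution needs, here is a minimal sketch; it is not taken from the actual OpenFST patch. The helper names InsertAfter and EraseAfter are hypothetical. They mimic forward_list's insert_after and erase_after on top of std::list, using only pre-C++11 features so that they would not themselves depend on a C++11 standard library.

```cpp
#include <cassert>
#include <list>

// Hypothetical helpers (not from OpenFST) that emulate the
// std::forward_list member functions insert_after and erase_after on a
// std::list. push_front, front, begin and end need no translation.

// Inserts `value` directly after `pos` and returns an iterator to the
// new element, like forward_list::insert_after.
template <typename T>
typename std::list<T>::iterator InsertAfter(
    std::list<T>& items, typename std::list<T>::iterator pos, const T& value) {
  assert(pos != items.end());  // there must be an element to insert after
  typename std::list<T>::iterator next = pos;
  ++next;
  return items.insert(next, value);  // list::insert inserts *before* `next`
}

// Erases the element directly after `pos` and returns an iterator to the
// element that followed it, like forward_list::erase_after.
template <typename T>
typename std::list<T>::iterator EraseAfter(
    std::list<T>& items, typename std::list<T>::iterator pos) {
  typename std::list<T>::iterator next = pos;
  ++next;
  assert(next != items.end());  // there must be an element to erase
  return items.erase(next);
}
```

The one forward_list feature with no direct counterpart is before_begin(); code that uses it to insert at the head can call push_front or insert at begin() instead.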
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> froydnj has been looking at getting Android building with libc++, I don't
> know if that work is far enough along to help you.
I have a solution for std::forward_list: simply use std::list. Unfortunately, my solution for std::unordered_map, using std::tr1::unordered_map, doesn't work in a cross-platform way. On some platforms the include <unordered_map> gives std::tr1::unordered_map; on others one does the include <tr1/unordered_map> to get std::unordered_map. Logical? So it seems, as in this comment, https://bugzilla.mozilla.org/show_bug.cgi?id=961289#c45, that there is no easy way to support these new C++11 classes in a cross-platform way in gecko.
Target Milestone: 2.2 S14 (12june) → ---
Blocks: 1171043
QA Contact: kdavis
It seems that even if one solves the problem of how to access the new C++11 classes with an ugly hack such as the following:

    #include <unordered_map>
    using
    #if defined(_STLP_BEGIN_NAMESPACE) || defined(_STLP_BEGIN_TR1_NAMESPACE)
    #ifdef _STLP_BEGIN_NAMESPACE
    _STLP_STD_NAME::
    #endif
    #ifdef _STLP_BEGIN_TR1_NAMESPACE
    tr1::
    #endif
    #else
    std::
    #endif
    unordered_map;

the use of such classes is prevented by Bug 1059255. Bug 1059255 uses objdump to grep for unwanted symbols

    objdump -p $(1) | grep -e 'GLIBCXX_3\.4\.\(9\|[1-9][0-9]\)'

that, in this case, include symbols pulled in by the use of unordered_map and other similar new C++11 classes. It then kills the build on finding any such symbols. So again, it seems as if the use of new C++11 classes is currently verboten and the only way to integrate the library is to rewrite it.
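For anyone wondering which version tags that grep pattern rejects, here is a small sketch that applies the same pattern with std::regex. It is only an illustration of the pattern, not the build system's actual check, and the function name IsFlaggedGlibcxxVersion is made up. The grep basic-regex escapes \( \| \) are rewritten in ECMAScript syntax, and regex_search is used because grep matches anywhere in the line.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true when `line` contains a libstdc++
// symbol version that the grep pattern quoted above would match, i.e.
// GLIBCXX_3.4.9 or GLIBCXX_3.4.NN with a two-digit NN of 10 or more.
bool IsFlaggedGlibcxxVersion(const std::string& line) {
  // Same pattern as the grep expression, in ECMAScript regex syntax.
  static const std::regex kPattern(R"(GLIBCXX_3\.4\.(9|[1-9][0-9]))");
  // Substring search, like grep: the tag may appear anywhere in the line.
  return std::regex_search(line, kPattern);
}
```

So a reference to GLIBCXX_3.4.8 or older passes, while GLIBCXX_3.4.9 and anything from GLIBCXX_3.4.10 upward is rejected.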
What symbols specifically does it find? We have taken steps to supply some of our own versions of necessary C++ symbols; it's possible similar techniques could be used here.
Flags: needinfo?(kdavis)
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #20)
> What symbols specifically does it find? We have taken steps to supply some
> of our own versions of necessary C++ symbols; it's possible similar
> techniques could be used here.
Here's the relevant snippet from the try fail....
19:55.21 TEST-UNEXPECTED-FAIL | check_stdcxx | We do not want these libstdc++ symbols to be used:
19:55.23 0000000000000000 DO *UND* 0000000000000000 GLIBCXX_3.4.10 _ZNSt8__detail12__prime_listE
19:55.25 gmake[5]: *** [jsep_session_unittest_standalone] Error 1
19:55.25 gmake[5]: *** Deleting file `jsep_session_unittest_standalone'
19:55.25 gmake[5]: *** Waiting for unfinished jobs....
19:55.25 Unified_cpp_js_src9.o
19:56.73 TEST-UNEXPECTED-FAIL | check_stdcxx | We do not want these libstdc++ symbols to be used:
19:56.73 0000000000000000 DO *UND* 0000000000000000 GLIBCXX_3.4.10 _ZNSt8__detail12__prime_listE
19:56.73 gmake[5]: *** [mediapipeline_unittest_standalone] Error 1
19:56.73 gmake[5]: *** Deleting file `mediapipeline_unittest_standalone'
19:56.91 Unified_cpp_security_manager_ssl3.o
19:58.11 TEST-UNEXPECTED-FAIL | check_stdcxx | We do not want these libstdc++ symbols to be used:
19:58.12 0000000000000000 DO *UND* 0000000000000000 GLIBCXX_3.4.10 _ZNSt8__detail12__prime_listE
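To make the offending symbol readable, the mangled name from the log can be demangled. The following is a minimal sketch that assumes a GCC- or Clang-style toolchain providing <cxxabi.h>; the wrapper name Demangle is hypothetical, while abi::__cxa_demangle is the standard Itanium C++ ABI entry point.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Hypothetical wrapper around abi::__cxa_demangle. Returns the demangled
// name, or the input unchanged when demangling fails.
std::string Demangle(const std::string& mangled) {
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with
  // malloc, so it must be released with free.
  char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    return mangled;
  }
  std::string result(out);
  std::free(out);
  return result;
}
```

Called on _ZNSt8__detail12__prime_listE, this yields std::__detail::__prime_list, which appears to be the prime-number table that libstdc++'s hash containers use for bucket sizing. That would explain why using unordered_map pulls in a symbol versioned GLIBCXX_3.4.10.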
Flags: needinfo?(kdavis)
Depends on: 1175323
Depends on: 1176616
Andre has decided on another way of approaching the G2P problem of Bug 1171043. Hence, this bug, Bug 1167875, should not be fixed.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Product: Core → Firefox Build System