EELLAK - Mailing Lists

Creation of an online Greek mail dictation system, using Sphinx and personalized acoustic/language models training: Some questions

  • Subject: Creation of an online Greek mail dictation system, using Sphinx and personalized acoustic/language models training: Some questions
  • From: Panagiotis Antoniadis <pantoniadis97 [ at ] gmail [ dot ] com>
  • Date: Tue, 19 Mar 2019 00:24:18 +0200
Hello,

My name is Panagiotis Antoniadis and I am a 4th-year student of Electrical
and Computer Engineering at the National Technical University of Athens. I
am interested in the project "Creation of an online Greek mail dictation
system, using Sphinx and personalized acoustic/language models training". I
am familiar with NLP and speech recognition from some academic projects,
and I hope to make a good proposal. Over the past few days, I have read the
Sphinx4 documentation and started writing a draft proposal describing the
whole procedure and the technologies that can be used. Regarding the
technologies, I have the following questions:

-  Are the technologies to be used (Python or Java) already defined, or
should I present my own view in my proposal? Sphinx4 is written in Java,
and I didn't find a library that provides a Python interface to it.
PocketSphinx also exists and can be used from Python, but it is intended
for lightweight purposes only. So, I understand that the whole project
should be written in Java. However, I believe that the classification
procedure should be implemented in Python in order to take advantage of
its powerful frameworks. Is it possible to use both languages, with a
library that connects them? If these questions are meant to be answered by
me in my proposal rather than discussed here, please let me know.

- In addition, it is mentioned that "the ASR output text will be fed to the
NLP (natural language processing) system that, based on the provided
corpora, will auto-correct or suggest corrections on the (usually
erroneous) generated text". So, will the ASR produce its output using the
whole language model from training, and will that output then be corrected
using the domain-specific language model?
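
To make the first question more concrete, one arrangement I can imagine
(only a sketch under my own assumptions, not something defined by the
project) is to keep recognition in a Java program built on Sphinx4 and call
it from Python as a separate process, reading the transcript from its
standard output. The helper name `run_recognizer` and the jar name
`transcriber.jar` are hypothetical; the demo at the bottom uses the Python
interpreter itself as a stand-in for the Java command so that it runs
without any Sphinx installation.

```python
import subprocess
import sys
from typing import Sequence


def run_recognizer(command: Sequence[str], wav_path: str,
                   timeout: float = 60.0) -> str:
    """Run an external recognizer and return what it printed on stdout.

    `command` is the program to launch, for example
    ["java", "-jar", "transcriber.jar"] (a hypothetical Sphinx4 wrapper
    that prints the transcript of the file given as its last argument).
    """
    result = subprocess.run(
        [*command, wav_path],
        capture_output=True,   # collect stdout and stderr
        encoding="utf-8",      # Greek transcripts need an explicit encoding
        timeout=timeout,       # do not hang forever on a stuck process
        check=True,            # raise CalledProcessError on non-zero exit
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in for the Java recognizer: a tiny Python one-liner that
    # echoes the file name it was given instead of decoding audio.
    fake_recognizer = [
        sys.executable, "-c",
        "import sys; print('heard ' + sys.argv[1])",
    ]
    print(run_recognizer(fake_recognizer, "mail.wav"))
```

With this split, the Python side would receive plain text and could pass
it to whatever classification framework is chosen, while the Java side
would stay a thin wrapper around Sphinx4.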

Thanks in advance, and forgive me for the long message.

-- 
Antoniadis Panagiotis
 
----
You are receiving this message from the list: A discussion list for student developers and mentors of Google Summer of Code projects.
https://lists.ellak.gr/gsoc-developers/listinfo.html

You can unsubscribe from the list by sending an empty email message to <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.
