ΕΕΛΛΑΚ - Mailing Lists

Re: Creation of an online Greek mail dictation system, using Sphinx and personalized acoustic/language models training: Some questions

  • From: Manos Tsardoulias <etsardou [ at ] eng [ dot ] auth [ dot ] gr>
  • Date: Fri, 22 Mar 2019 14:34:08 +0200
Hello Panagiotis,

Indeed, the thesis of Mr. Ouzounis had the same concept, but it was built
for desktop deployment. In the proposed GSoC project the concept remains
the same, but 1) the overall software must be changed to offer this
functionality from a web browser, 2) communication/APIs/services must be
built to communicate with the cloud, and 3) personalized (or not)
heavy-weight algorithms can be executed in the cloud. Of course, working
with the Greek language changes the whole approach, since not many NLP
tools exist for it. In conclusion, we will aim for the same concept, but
the methodology can be different, since the existing tools are a bit
restrictive.
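
As a rough illustration of point 2 above, the cloud side could expose a small HTTP service that the browser posts audio to. The sketch below uses only the Python standard library. The `transcribe` function is a hypothetical placeholder for the real recognizer, and the `X-User-Id` header and port 8000 are assumptions made for the example, not part of the project description.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def transcribe(audio: bytes, user_id: str) -> str:
    """Hypothetical placeholder for the real recognizer."""
    # A real implementation would decode the audio with the user's models.
    return f"<{len(audio)} bytes of audio from {user_id}>"


def build_response(audio: bytes, user_id: str) -> dict:
    """Build the JSON payload that is returned to the browser."""
    return {"user": user_id, "text": transcribe(audio, user_id)}


class DictationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The browser posts raw audio; the user id travels in a header (assumed).
        length = int(self.headers.get("Content-Length", 0))
        audio = self.rfile.read(length)
        user_id = self.headers.get("X-User-Id", "anonymous")
        body = json.dumps(build_response(audio, user_id)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("localhost", port), DictationHandler).serve_forever()
```

Calling `serve()` starts the service on localhost; a browser frontend could then POST the recorded audio to it and display the returned `text` field.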

Of course, we will aim for efficient resource use, since the tool will be
deployed in real life (so the number of users is unknown) and, in practice,
cloud computational resources are usually limited for financial reasons
(whereas in theory the cloud is unlimited) :).

Finally, the project can be enhanced with aspects such as real-time (or
near-real-time) acoustic model adaptation, so as to improve
personalization over time.

Best,
Manos

On Fri, Mar 22, 2019 at 1:25 AM Panagiotis Antoniadis <
pantoniadis97 [ at ] gmail [ dot ] com> wrote:

> Hello,
>
> After reading the diploma thesis of Giwrgos Ouzounis that was suggested, I
> have the following questions, so that I can write a proposal that meets all
> the requirements and the project's needs. If I understood correctly, the
> thesis addresses the same problem as the GSoC project: mail dictation using
> techniques such as personalized acoustic model training and suggested
> corrections based on domain classification. Of course, there are some
> differences, such as the different language and the deployment in the cloud.
> So, will the project be an extension of this existing project, enhancing
> the model, integrating the Greek language into it and deploying it in the
> cloud? Or is the thesis just reference material, so that different
> techniques can be used now that we have the computing power of the cloud?
> For example, I found that the Kaldi toolkit can be efficient when a lot of
> computing power is available, and more specifically something like this:
> https://github.com/alumae/kaldi-gstreamer-server.
>
> Antoniadis Panagiotis
>
> On Wed, Mar 20, 2019 at 1:09 PM Manos Tsardoulias <
> etsardou [ at ] eng [ dot ] auth [ dot ] gr> wrote:
>
>> Hello Panagiotis,
>>
>> Thanks for the interest in our proposal. Concerning your questions:
>>
>>    - Python/Java is a stack that we are using; nevertheless, you are free
>>    to suggest any tool or programming language you see fit. Also, since the
>>    project contains both frontend and backend work, more than one programming
>>    language will be used.
>>    - Concerning the corrections in the generated text, you are right: the
>>    generic language model will produce a proposal and the domain/email
>>    specific model will enhance the proposal by suggesting alterations. The
>>    desired output will be the dictated text, annotated with what the system
>>    finds uncertain/not clear, based on the specific user's history.
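
The annotated output described in the second point could look roughly like the sketch below. It assumes the generic recognizer returns `(word, confidence)` pairs and that the user's history is available as a list of past email texts. The function names, the 0.6 confidence threshold and the use of `difflib` string similarity in place of a real domain-specific language model are all assumptions made for illustration.

```python
from collections import Counter
from difflib import get_close_matches


def build_user_vocabulary(past_emails):
    """Count word frequencies in the user's earlier emails."""
    counts = Counter()
    for text in past_emails:
        counts.update(text.lower().split())
    return counts


def annotate(hypothesis, user_vocab, threshold=0.6):
    """Mark low-confidence words and suggest replacements from user_vocab.

    hypothesis: list of (word, confidence) pairs from the generic recognizer.
    """
    annotated = []
    for word, confidence in hypothesis:
        entry = {"word": word, "uncertain": False, "suggestions": []}
        if confidence < threshold:
            entry["uncertain"] = True
            # Closest words the user has actually written before,
            # excluding the recognized word itself.
            matches = get_close_matches(
                word.lower(), list(user_vocab), n=3, cutoff=0.7
            )
            entry["suggestions"] = [m for m in matches if m != word.lower()]
        annotated.append(entry)
    return annotated
```

For example, `annotate([("repord", 0.4)], build_user_vocabulary(["the report is ready"]))` flags the word as uncertain and offers candidates drawn from the user's own emails.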
>>
>> Best,
>> Manos
>>
>> On Tue, Mar 19, 2019 at 11:15 AM Panagiotis Antoniadis <
>> pantoniadis97 [ at ] gmail [ dot ] com> wrote:
>>
>>> Hello,
>>>
>>> My name is Panagiotis Antoniadis and I am a 4th-year student of
>>> Electrical and Computer Engineering at the National Technical University
>>> of Athens. I am interested in the project "Creation of an online Greek mail
>>> dictation system, using Sphinx and personalized acoustic/language models
>>> training". I am familiar with the concepts of NLP and speech recognition
>>> from some academic projects and I hope to make a good proposal. These
>>> days, I have been reading the documentation of Sphinx4 and have started
>>> writing a draft proposal describing the whole procedure and the
>>> technologies that can be used. Regarding the technologies, I have the
>>> following questions:
>>>
>>> - Are the technologies that will be used (Python or Java) already
>>> defined, or should I present my own view in my proposal? Sphinx4 is
>>> written in Java and I didn't find a library that provides a Python
>>> interface to it. PocketSphinx also exists and can be used from Python, but
>>> it is for lightweight purposes only. So, I understand that the whole
>>> project should be written in Java. However, I believe that the
>>> classification procedure should be implemented in Python in order to take
>>> advantage of all its powerful frameworks. Is it possible to use both
>>> languages, using a library that connects them? If all these questions
>>> should be answered by me in my proposal and not discussed here, let me
>>> know.
>>>
>>> - In addition, it is mentioned that "the ASR output text will be fed to
>>> the NLP (natural language processing) system that, based on the provided
>>> corpora, will auto-correct or suggest corrections on the (usually
>>> erroneous) generated text". So, will the ASR be trained with the whole
>>> language model, and its output then be corrected using the
>>> domain-specific language model?
>>>
>>> Thanks in advance and forgive me for the long text.
>>>
>>> --
>>> Antoniadis Panagiotis
>>>
>>> ----
>>> You are receiving this message from the list: A mailing and discussion
>>> list for student developers & mentors of Google Summer of Code
>>> projects,
>>> https://lists.ellak.gr/gsoc-developers/listinfo.html
>>>
>>> You can unsubscribe from the list by sending an empty email message to
>>> <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.
>>>
>>
>>
>> --
>> Emmanouil G. Tsardoulias
>> PhD, Electrical Engineer
>> Aristotle University of Thessaloniki, Faculty of Engineering
>> School of Electrical and Computer Engineering
>> 54124 Thessaloniki, GREECE
>> Tel.: +302310995922
>> http://users.auth.gr/etsardou/
>> http://r4a.issel.ee.auth.gr/
>>
>
>
> --
> Αντωνιάδης Παναγιώτης
>


-- 
Emmanouil G. Tsardoulias
PhD, Electrical Engineer
Aristotle University of Thessaloniki, Faculty of Engineering
School of Electrical and Computer Engineering
54124 Thessaloniki, GREECE
Tel.: +302310995922
http://users.auth.gr/etsardou/
http://r4a.issel.ee.auth.gr/
 