EELLAK - Mailing Lists

Re: Creation of an online Greek mail dictation system, using Sphinx and personalized acoustic/language models training: Some questions

Subject: Re: Creation of an online Greek mail dictation system, using Sphinx and personalized acoustic/language models training: Some questions
From: Panagiotis Antoniadis <pantoniadis97 [ at ] gmail [ dot ] com>
Date: Tue, 9 Apr 2019 14:40:27 +0300

Hello,

I have submitted my final proposal for the Creation of an online Greek mail
dictation system, using Sphinx and personalized acoustic/language models
training. Thanks for your time and the useful feedback you provided.

Antoniadis Panagiotis


On Fri, 22 Mar 2019 at 2:34 PM, Manos Tsardoulias <
etsardou [ at ] eng [ dot ] auth [ dot ] gr> wrote:

> Hello Panagiotis,
>
> Indeed, Mr. Ouzounis's thesis had the same concept, but it was built
> for desktop deployment. In the proposed GSoC project the concept remains
> the same, but 1) the overall software must be changed so as to offer this
> functionality from a web browser, 2) communication/APIs/services must be
> built in order to communicate with the cloud and 3) personalized (or not)
> heavy-weight algorithms could be executed. Of course, the involvement of
> the Greek language changes the whole approach, since not many NLP tools
> exist for it. In conclusion, we will aim for the same concept, but the
> methodology can be different, since the existing tools are a bit restrictive.
>
> We will also aim for efficient resource use, since the tool will be
> deployed in real life (so the number of users is unknown) and, in
> practice, cloud computational resources are usually limited for financial
> reasons (whereas in theory the cloud is unlimited) :).
>
> Finally, the project can be enhanced with aspects such as real-time (or
> near-real-time) acoustic model adaptation, so as to improve
> personalization over time.
>
> Best,
> Manos
>
> On Fri, Mar 22, 2019 at 1:25 AM Panagiotis Antoniadis <
> pantoniadis97 [ at ] gmail [ dot ] com> wrote:
>
>> Hello,
>>
>> After reading the suggested diploma thesis of Giwrgos Ouzounis, I have
>> the following questions, so that my proposal meets all the requirements
>> and the project's needs. If I understood correctly, the thesis addresses
>> the same problem as the GSoC project: mail dictation using techniques
>> such as personalized acoustic model training and suggested corrections
>> based on domain classification. Of course, there are some differences,
>> such as the different language and the deployment in the cloud. So, will
>> the project be an extension of this existing work, enhancing the model,
>> integrating the Greek language into it and deploying it in the cloud? Or
>> is the thesis just reference material, so that different techniques can
>> be used now that we have the computing power of the cloud? For example,
>> I found that the Kaldi toolkit can be efficient when a lot of computing
>> power is available, and more specifically something like this:
>> https://github.com/alumae/kaldi-gstreamer-server
>>
>> Antoniadis Panagiotis
>>
>> On Wed, 20 Mar 2019 at 1:09 PM, Manos Tsardoulias <
>> etsardou [ at ] eng [ dot ] auth [ dot ] gr> wrote:
>>
>>> Hello Panagiotis,
>>>
>>> Thanks for your interest in our proposal. Concerning your questions:
>>>
>>>    - Python/Java is a stack that we are using; nevertheless, you are
>>>    free to suggest any tool or programming language you see fit. Also, since
>>>    the project contains both frontend and backend work, more than one
>>>    programming language will be used.
>>>    - Concerning the corrections in the generated text, you are right:
>>>    the generic language model will produce a proposal and the domain/email
>>>    specific model will enhance the proposal by suggesting alterations. The
>>>    desired output will be the dictated text, annotated with what the system
>>>    finds uncertain/not clear, based on the specific user's history.
>>>
>>> Best,
>>> Manos
>>>
>>> On Tue, Mar 19, 2019 at 11:15 AM Panagiotis Antoniadis <
>>> pantoniadis97 [ at ] gmail [ dot ] com> wrote:
>>>
>>>> Hello,
>>>>
>>>> My name is Panagiotis Antoniadis and I am a 4th-year student of
>>>> Electrical and Computer Engineering at the National Technical University
>>>> of Athens. I am interested in the project "Creation of an online Greek mail
>>>> dictation system, using Sphinx and personalized acoustic/language models
>>>> training". I am familiar with the concepts of NLP and speech recognition
>>>> from some academic projects and I hope to make a good proposal. Over the
>>>> past few days, I have read the documentation of Sphinx4 and started
>>>> writing a draft proposal describing the whole procedure and the
>>>> technologies that can be used. Regarding the technologies, I have the
>>>> following questions:
>>>>
>>>> - Are the technologies that will be used (Python or Java) already
>>>> defined, or should I present my own view in my proposal? Sphinx4 is
>>>> written in Java and I didn't find a library that provides a Python
>>>> interface to it. PocketSphinx also exists and can be used from Python, but
>>>> it is for lightweight purposes only. So, I understand that the whole
>>>> project should be written in Java. However, I believe that the classification
>>>> procedure should be implemented in Python in order to take advantage of all
>>>> its powerful frameworks. Is it possible to use both languages with a
>>>> library that connects them? If all these questions should be answered by me
>>>> in my proposal and not discussed here, let me know.
>>>>
>>>> - In addition, it is mentioned that "the ASR output text will be fed
>>>> to the NLP (natural language processing) system that, based on the provided
>>>> corpora, will auto-correct or suggest corrections on the (usually
>>>> erroneous) generated text". So, will the ASR use the whole language
>>>> model in training, with its output then corrected using the
>>>> domain-specific language model?
>>>>
>>>> Thanks in advance and forgive me for the long text.
>>>>
>>>> --
>>>> Antoniadis Panagiotis
>>>>
>>>> ----
>>>> You are receiving this message from the list: A discussion list for
>>>> student developers and mentors of Google Summer of Code projects,
>>>> https://lists.ellak.gr/gsoc-developers/listinfo.html
>>>>
>>>> You can unsubscribe from the list by sending an empty email message
>>>> to <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.
>>>>
>>>
>>>
>>> --
>>> Emmanouil G. Tsardoulias
>>> PhD, Electrical Engineer
>>> Aristotle University of Thessaloniki, Faculty of Engineering
>>> School of Electrical and Computer Engineering
>>> 54124 Thessaloniki, GREECE
>>> Tel.: +302310995922
>>> http://users.auth.gr/etsardou/
>>> http://r4a.issel.ee.auth.gr/
>>>
>>
>>
>> --
>> Antoniadis Panagiotis
>>
>
>
>
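
As a rough illustration of the correction flow discussed in the thread above (a generic model proposes text, a user-specific model suggests alterations, and unclear words are annotated), the idea could be sketched as follows. This is only a minimal sketch under stated assumptions, not the project's implementation: the Token class, the annotate and render functions, the per-word confidence score, the 0.6 threshold and the [?word -> suggestion] markup are all hypothetical names and values chosen for this example, and plain string similarity from Python's standard difflib module stands in for a real domain-specific language model.

```python
import difflib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Token:
    text: str
    confidence: float  # hypothetical per-word recogniser score in [0, 1]
    uncertain: bool = False
    suggestions: List[str] = field(default_factory=list)


def annotate(tokens: List[Token], user_history: Dict[str, int],
             threshold: float = 0.6) -> List[Token]:
    """Flag low-confidence words and suggest replacements from the user's history.

    user_history maps lowercase words from the user's earlier emails to counts
    (an assumed stand-in for a personalized language model).
    """
    vocabulary = list(user_history)
    for token in tokens:
        known = token.text.lower() in user_history
        # Design choice of this sketch: a word the user has written before
        # is accepted even if the recogniser was unsure about it.
        if token.confidence >= threshold or known:
            continue
        token.uncertain = True
        # Up to three closest words from the user's own vocabulary,
        # ranked by string similarity.
        token.suggestions = difflib.get_close_matches(
            token.text.lower(), vocabulary, n=3, cutoff=0.6)
    return tokens


def render(tokens: List[Token]) -> str:
    """Return the dictated text with uncertain words marked for review."""
    parts = []
    for token in tokens:
        if token.uncertain:
            hint = "|".join(token.suggestions)
            parts.append(f"[?{token.text}" + (f" -> {hint}" if hint else "") + "]")
        else:
            parts.append(token.text)
    return " ".join(parts)
```

With user_history = {"report": 5, "meeting": 2}, the tokens Token("send", 0.9) and Token("repord", 0.4) render as "send [?repord -> report]". difflib compares Unicode strings, so the same sketch would also accept Greek words, although a real system would more likely rank suggestions with language-model probabilities than with string similarity.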

 
