ΕΕΛΛΑΚ - Mailing Lists

Re: [opensource-devs] Interest in the Government Gazette

Dear Ioakim,

Take a look at this project (
https://ellak.gr/wiki/index.php?title=GSOC2018_Projects#Adding_Greek_language_on_NLP_library_Spacy.io
); its results may be of interest to your project.

T.K.

2018-03-24 22:04 GMT+02:00 Iraklis Varlamis <varlamis [ at ] gmail [ dot ] com>:

> Dear Ioakim,
> As Dr. Karounos wrote, Python provides helpful libraries both for
> machine learning (scikit-learn) and for text processing and NLP
> (e.g. NLTK). Java can certainly be used instead.
> Candidate references to Greek government entities (these can be the named
> entities, e.g. General Secretariat of ..., or Mayor of ...) in the text can
> be found using either regex or machine learning (once training samples are
> available), and the same holds for the assigned responsibilities.
> More details can be discussed once the project begins.
> Stanford's CoreNLP https://stanfordnlp.github.io/CoreNLP/ and Apache's
> OpenNLP https://opennlp.apache.org/ are the two tools to check if you are
> going to work with Java.
>
> Iraklis
>
>
>
> On Sat, Mar 24, 2018 at 6:20 PM, Theodoros G. Karounos <
> t [ dot ] karounos [ at ] gmail [ dot ] com> wrote:
>
>> Please find my answers in-line.
>>
>> 2018-03-24 11:08 GMT+02:00 ioaktheo <ioaktheo [ at ] teiser [ dot ] gr>:
>>
>>> Dear Sirs,
>>>
>>> I am writing this email to you with regards to my interest in the
>>> project named «Extraction of Responsibilities per unit in public sector
>>> organizations from the Government Gazette». Having read through the details
>>> of the project I would like to ask some questions so that I can understand
>>> better the requirements. I would be very grateful if you have the time to
>>> answer these questions before I submit my proposal.
>>>
>>> First, I see that the knowledge prerequisites include Python, Java and
>>> Machine Learning. I’m more familiar with Java, Machine Learning and Data
>>> Mining. I haven’t worked with Python, but I am willing to sit down and work
>>> with this language before Google Summer of Code starts. Is Python going to be
>>> used for Machine learning purposes?
>>>
>> *Python is preferred for machine learning, but Java does the job as well.*
>>
>>> Secondly, am I right in understanding that using Machine Learning to
>>> automatically find and match «specific Named Entities types with references
>>> to assigned responsibilities-services per unit and links between the two
>>> must be extracted» is one of the main issues of this project?
>>>
>> *Yes. One of the main tasks of this project is to extract, in hierarchical
>> order, the assigned responsibilities-services for each unit of a Greek
>> government entity, working from the text of the PDFs of the laws that define
>> the governance of that entity.*
>>
>>>
>>> If so, am I right in thinking the steps required include: Preprocessing
>>> the data, Data integration, Hierarchical or partitioned clustering,
>>> Categorization and correlation rules?
>>>
>>
>> *Yes, this is the approach in a few words; you should expand on it in your
>> project. We will discuss this extensively with all the mentors (
>> https://ellak.gr/wiki/index.php?title=GSOC2018_Projects#Extraction_of_Responsibilities_per_unit_in_public_sector_organizations_from_the_Government_Gazette
>> ) once the project is approved.*
>>
>>>
>>> Finally, I am a bit confused about the NER module. Is there any more
>>> information on this subject?
>>>
>> *Please read this ( https://nlp.stanford.edu/software/CRF-NER.html ). There
>> are plenty more resources; search Google Scholar (
>> https://scholar.google.gr/scholar?hl=el&as_sdt=0%2C5&q=Named+Entity+Recognizer&btnG=
>> ), etc.*
>>
>>
>>>
>>> Thank you in advance.
>>> Best regards
>>> Ioakeim
>>>
>>>
>>> ----
>>> You are receiving this message from the list: A general discussion list
>>> for developers/contributors of open-source projects,
>>> https://lists.ellak.gr/opensource-devs/listinfo.html
>>>
>>> You can unsubscribe from the list by sending an empty email
>>> to <opensource-devs+unsubscribe [ at ] ellak [ dot ] gr>.
>>>
>>>
>>
>>
>> --
>> Jiddu Krishnamurti: If we can really understand the problem, the answer
>> will come out of it, because the answer is not separate from the problem.
>>
>> http://karounos.gr/blog/, Key-ID: 85AE3458
>>
>>
>
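
To make the regex option mentioned in the quoted thread a little more concrete, here is a minimal sketch in Python of how candidate entity mentions might be collected. The trigger phrases are just the two examples from Iraklis's message, and the names TRIGGERS, CANDIDATE_RE, find_candidates and SAMPLE are hypothetical, chosen for illustration only. Real Gazette text is in Greek, so the patterns would have to be rewritten for Greek; this is not the project's actual method.

```python
import re

# Hypothetical trigger phrases, taken from the examples in the thread.
# Assumption: real Government Gazette text is Greek and needs Greek patterns.
TRIGGERS = ["General Secretariat", "Mayor"]

# A trigger phrase, then " of ", an optional "the ", and one or more
# capitalized words that may be joined by "and", "of" or "for".
CANDIDATE_RE = re.compile(
    r"\b(?:%s) of (?:the )?[A-Z]\w+(?: (?:(?:and|of|for) )?[A-Z]\w+)*"
    % "|".join(re.escape(t) for t in TRIGGERS)
)


def find_candidates(text):
    """Return (start, end, matched_text) for each candidate entity mention."""
    return [(m.start(), m.end(), m.group(0)) for m in CANDIDATE_RE.finditer(text)]


# Made-up sample sentences, for illustration only.
SAMPLE = (
    "The General Secretariat of Public Revenue is responsible for tax "
    "collection. The Mayor of Athens approves the municipal budget."
)

for start, end, name in find_candidates(SAMPLE):
    print(start, end, name)
```

Candidates found this way could later serve as training samples for a machine-learning recognizer, which is the other option mentioned in the thread.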


-- 
Jiddu Krishnamurti: If we can really understand the problem, the answer
will come out of it, because the answer is not separate from the problem.

http://karounos.gr/blog/, Key-ID: 85AE3458
 