Please find my answers in-line.

2018-03-24 11:08 GMT+02:00 ioaktheo <ioaktheo [ at ] teiser [ dot ] gr>:

> Dear Sirs,
>
> I am writing to you regarding my interest in the project named
> «Extraction of Responsibilities per unit in public sector organizations
> from the Government Gazette». Having read through the details of the
> project, I would like to ask some questions so that I can better
> understand the requirements. I would be very grateful if you had the time
> to answer them before I submit my proposal.
>
> First, I see that the knowledge prerequisites include Python, Java and
> Machine Learning. I am more familiar with Java, Machine Learning and Data
> Mining. I haven't worked with Python, but I am willing to learn and work
> with this language before Google Summer of Code starts. Is Python going
> to be used for Machine Learning purposes?

*Python is preferred for machine learning, but Java does the job as well.*

> Secondly, am I right in understanding that using Machine Learning to
> automatically find and match «specific Named Entities types with
> references to assigned responsibilities-services per unit and links
> between the two must be extracted» is one of the main issues of this
> project?

*Yes. One of the main tasks of this project is to extract, in hierarchical order, the assigned responsibilities and services for each unit of an institution, working from the text of the PDFs of the laws that define the governance of Greek government entities.*

> If so, am I right in thinking the steps required include: preprocessing
> the data, data integration, hierarchical or partitioned clustering,
> categorization and correlation rules?

*Yes, this is the approach in a few words; you should expand on it in your proposal. We will discuss this extensively with all the mentors (https://ellak.gr/wiki/index.php?title=GSOC2018_Projects#Extraction_of_Responsibilities_per_unit_in_public_sector_organizations_from_the_Government_Gazette) once the project is approved.*

> Finally, I am a bit confused about the NER module. Is there any more
> information on this subject?

*Please read this (https://nlp.stanford.edu/software/CRF-NER.html). There are plenty more resources; search Google Scholar (https://scholar.google.gr/scholar?hl=el&as_sdt=0%2C5&q=Named+Entity+Recognizer&btnG=), etc.*

> Thank you in advance.
> Best regards
> Ioakeim
>
> ----
> You are receiving this message from the list: A general discussion list
> for developers/contributors of open-source projects,
> https://lists.ellak.gr/opensource-devs/listinfo.html
>
> You can unsubscribe from the list by sending an empty email to
> <opensource-devs+unsubscribe [ at ] ellak [ dot ] gr>.

--
Jiddu Krishnamurti: If we can really understand the problem, the answer will come out of it, because the answer is not separate from the problem.
http://karounos.gr/blog/, Key-ID: 85AE3458