To whom it may concern, and especially to Dr. Alexios Zavras,
Hello, my name is Lampros Avouris. I am a fourth-year electrical
engineering and computer technologies student at the University of
Patras.
I would like to participate in GSoC 2025, and the Exploring and
Abstracting Triplestore Alternatives project has especially piqued my
interest.
I have spent some time looking into the project and have some questions
that would help me write the best possible proposal.
Firstly: When referring to triplestore alternatives, are we talking
exclusively about systems with native RDF triplestore support, like
Apache Jena, Stardog, etc.? Or would it also be useful to explore
implementing triplestores on technologies that are not purpose-built
for them, for example by adapting graph or multi-model databases like
Amazon Neptune or ArangoDB, or even by building a triplestore on top of
a traditional database like PostgreSQL or SQLite?
My first thought was to create at minimum a single case of each kind
for our comparisons, focusing initially on technologies that have
RDFLib support, and to expand from there if possible/necessary.
Secondly, when it comes to querying our triplestores, am I correct in
understanding that the only query language we will support is SPARQL?
That is, the user will query in SPARQL and not in any native query
language like Gremlin or GraphQL.
Thirdly, in terms of the project structure, I imagine the following: We
have an abstraction layer accessible by the user, as described. Within
it we have a TriplestoreType class, which serves as an enumeration of
all the types of triplestores we support, and a TriplestoreFactory
class, which creates the required triplestore implementation. Each
implementation is defined in its own class, e.g., RDFLibTriplestore for
triplestores that can be implemented using the rdflib Python library.
Finally, we would have a manager class that serves as a unified
interface. At minimum, I think each implementation should have methods
to add any number of triples, query triples, and remove triples, as
well as to execute SPARQL queries and to read from and write to files.
I would appreciate any suggestions from you about expanding these
methods, as well as any criticism of this design and where it could be
improved.
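To make the structure above more concrete, here is a rough sketch of
how I picture the pieces fitting together. It is only an illustration
under my own assumptions: the names Triplestore, InMemoryTriplestore,
TriplestoreManager, and match are placeholders I made up, and the
set-backed backend merely stands in for a real one such as an
rdflib-based implementation. SPARQL execution and file I/O are left out
to keep it short.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


class TriplestoreType(Enum):
    """Enumeration of supported backends (only a stand-in one here)."""
    IN_MEMORY = "in_memory"


class Triplestore(ABC):
    """Hypothetical interface that every backend implementation shares."""

    @abstractmethod
    def add(self, triples: Iterable[Triple]) -> None: ...

    @abstractmethod
    def remove(self, triples: Iterable[Triple]) -> None: ...

    @abstractmethod
    def match(self, pattern: Pattern) -> Iterator[Triple]: ...


class InMemoryTriplestore(Triplestore):
    """Stand-in backend using a Python set instead of a real store."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, triples: Iterable[Triple]) -> None:
        self._triples.update(triples)

    def remove(self, triples: Iterable[Triple]) -> None:
        self._triples.difference_update(triples)

    def match(self, pattern: Pattern) -> Iterator[Triple]:
        # None acts as a wildcard in any of the three positions.
        for triple in sorted(self._triples):
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                yield triple


class TriplestoreFactory:
    """Maps each TriplestoreType to the class that implements it."""
    _registry = {TriplestoreType.IN_MEMORY: InMemoryTriplestore}

    @classmethod
    def create(cls, kind: TriplestoreType) -> Triplestore:
        return cls._registry[kind]()


class TriplestoreManager:
    """Unified interface the user talks to, whatever the backend is."""

    def __init__(self, kind: TriplestoreType) -> None:
        self._store = TriplestoreFactory.create(kind)

    def add(self, *triples: Triple) -> None:
        self._store.add(triples)

    def remove(self, *triples: Triple) -> None:
        self._store.remove(triples)

    def match(self, s=None, p=None, o=None):
        return list(self._store.match((s, p, o)))
```

With this layout, supporting a new backend would mean adding one enum
member, one implementation class, and one registry entry, while user
code that goes through the manager stays unchanged.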
With regard to the benchmarking, which specific cases are you looking
for? Simple accesses of the DB, or more complex operations as well?
I assume the criteria will be execution time and RAM usage.
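As a starting point for measuring those two criteria, I was thinking of
something along these lines. This is only a sketch using the Python
standard library; the benchmark function and its parameters are my own
placeholders. Note that tracemalloc only sees memory allocated by the
Python process (so not that of an external database server) and that
tracing slows the timed code down, so a real benchmark would probably
measure time and memory in separate runs.

```python
import time
import tracemalloc


def benchmark(operation, repeats=5):
    """Return (best wall-clock seconds, peak traced bytes) over several runs."""
    best_time = float("inf")
    peak_bytes = 0
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        operation()  # the triplestore operation under test
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        best_time = min(best_time, elapsed)
        peak_bytes = max(peak_bytes, peak)
    return best_time, peak_bytes
```

The operation argument would be any zero-argument callable, for example
a function that inserts a fixed batch of triples or runs one query
against a given backend.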
Fourthly, I have seen that you would appreciate a weekly or at least a
biweekly schedule. Could you please be more specific about what you
would require in the schedule, e.g., would it be a problem if I took
some time off during exam season, how quickly you would like each
project goal to be reached, etc.? Additionally, I assume that you would
like to see a domain diagram of the project.
Regarding scheduling, I am also hoping to participate in the Code in
Place program as an instructor. I am 99% certain that it won't cause a
scheduling conflict with our meetings, but I thought I should mention
it in advance so as not to surprise you later.
Fifthly: The GSoC site mentions that we should ask about any special
requirements for crafting our application; would you happen to have
any?
I apologize for the length of this message and thank you in advance.
Yours sincerely,
Lampros Avouris