EELLAK - Mailing Lists

Re: Interest and some Questions about "Exploring and Abstracting Triplestore Alternatives" Project

Subject: Re: Interest and some Questions about "Exploring and Abstracting Triplestore Alternatives" Project
From: "Alexios Zavras" <zvr+eellak [ at ] zvr [ dot ] gr>
Date: Fri, 21 Mar 2025 14:43:41 +0100

Maira, please use the mailing list [added in cc],
so that others may see the exchange and get some info.
I've redacted your proposal text.

The proposal should definitely include a workplan / timeline.
Ideally it should have weekly granularity, but two-week
granularity is also acceptable. This is the only way of
keeping track of the progress of the work.

To your specific question about triplestore alternatives to explore:
the set you propose is rather exhaustive. I can only think of MillenniumDB
and probably KuzuDB to add as alternatives.
It may not be feasible to explore all of these in depth.
I would suggest that you take a look at each one of them,
see which you are comfortable with (I mean, install, try them out)
and focus first on these.
It's better to end up with a working library for a few
than an incomplete library trying to handle everything.
This is another point where your workplan will guide you.

On the API design, I admit I was surprised by your idea
to use a REST API (and Flask to implement it).
I don't outright discard this idea, but it has to be justified somehow.

When designing the API, start by thinking about what functionality
the program using the library will require -- that's what you have
to provide.
Let's see: it will obviously need to (a) connect to a datastore
and (b) execute commands (queries or other).

Think of SQLAlchemy (which abstracts SQL database access)
and the first primitives it provides:
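    from sqlalchemy import create_engine

    engine = create_engine(db_info)     # describe the database to use
    with engine.connect() as c:         # (a) connect to it
        result = c.execute(commands)    # (b) execute commands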
These correspond to (a) and (b) above, since the required functionality
is the same.

Going further, you might think of providing more functionality
like bulk load of data (instead of the user doing a number of INSERT
statements themselves).
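
A bulk-load primitive along these lines could do the batching itself, so
the caller makes one call instead of issuing many INSERT statements.
The following is only a sketch under stated assumptions: bulk_load,
batch_size and the execute callable are hypothetical names, the back end
is assumed to accept SPARQL Update "INSERT DATA" requests, and each
triple is assumed to be already serialized as an N-Triples-style line.

```python
from typing import Callable, Iterable, List


def bulk_load(execute: Callable[[str], object],
              triples: Iterable[str],
              batch_size: int = 1000) -> int:
    """Send triples as batched INSERT DATA updates; return the batch count.

    `execute` is a hypothetical callable that sends one update string
    to the triplestore (for example, a bound execute() method).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    def flush(lines: List[str]) -> None:
        # One SPARQL Update request carrying all triples of the batch.
        execute("INSERT DATA {\n" + "\n".join(lines) + "\n}")

    batches = 0
    batch: List[str] = []
    for triple in triples:
        batch.append(triple)
        if len(batch) == batch_size:
            flush(batch)
            batches += 1
            batch = []
    if batch:  # flush the last, partially filled batch
        flush(batch)
        batches += 1
    return batches
```

The helper only depends on something that can execute a command, so it
would sit on top of the same execute primitive rather than needing
back-end-specific code of its own.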

By thinking this way, you should get to a list of primitives/functions
that your library should provide.

Implementing those is the next step. To the user, they will be simple
function calls. Whether you choose to implement them through a REST
interface, over HTTP to a server process, is an implementation decision
for you. I personally find this too complex.

My modeling would be: I expose an API with a connect() function.
I then implement this for different back-end triplestores:
connect_blazegraph(), connect_jena(), connect_millennium(), etc.
and connect() decides which one to call.
The same with execute() -- and that's all.
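
One way to sketch this dispatch is a small registry that maps a back-end
name to its specific connect function. Everything here is illustrative:
the Connection class, the "backend" and "url" parameters and the stub
bodies are assumptions, not the API of any existing library.

```python
from typing import Callable, Dict


class Connection:
    """Hypothetical stand-in for whatever a real back end would return."""

    def __init__(self, backend: str, url: str) -> None:
        self.backend = backend
        self.url = url


def connect_blazegraph(url: str) -> Connection:
    # A real implementation would open a session to Blazegraph here.
    return Connection("blazegraph", url)


def connect_jena(url: str) -> Connection:
    # A real implementation would talk to a Jena server here.
    return Connection("jena", url)


# Registry mapping a back-end name to its specific connect function.
_CONNECTORS: Dict[str, Callable[[str], Connection]] = {
    "blazegraph": connect_blazegraph,
    "jena": connect_jena,
}


def connect(backend: str, url: str) -> Connection:
    """Public entry point: decide which back-end function to call."""
    try:
        connector = _CONNECTORS[backend]
    except KeyError:
        raise ValueError(f"unsupported triplestore: {backend!r}") from None
    return connector(url)
```

Supporting another triplestore then means writing one more connect_*()
function and adding it to the registry; the caller's code does not change.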

Doing it the OO way, there will be a general TripleStore() class
(with methods connect() and execute()) and various implementation
classes like Jena(TripleStore) with their own methods that actually talk
to the specific back end.
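
A minimal sketch of that class layout, using an abstract base class.
InMemoryStore is a made-up toy back end that only records commands, so
the example runs without any server; the method signatures are
assumptions chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class TripleStore(ABC):
    """General interface that every back end implements."""

    @abstractmethod
    def connect(self) -> None:
        """Open the connection to the back end."""

    @abstractmethod
    def execute(self, command: str) -> Any:
        """Run a query or other command and return its result."""


class InMemoryStore(TripleStore):
    """Toy back end that only records commands, to show the shape."""

    def __init__(self) -> None:
        self.connected = False
        self.log: List[str] = []

    def connect(self) -> None:
        self.connected = True

    def execute(self, command: str) -> List[str]:
        if not self.connected:
            raise RuntimeError("call connect() first")
        self.log.append(command)
        return []  # a real back end would return query results
```

A Jena(TripleStore) or Blazegraph(TripleStore) class would follow the
same pattern, replacing the body of each method with the calls that
talk to that specific server.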

Hope this helps,

On Thu, Mar 20, 2025, at 12:48, Maira Papadopoulou wrote:
> Dear Alexios Zavras,
>
> I hope this email finds you well. I have begun drafting my proposal for 
> the project and would appreciate any feedback or guidance you can 
> provide to help refine my approach.
>
> [...]
> 
> Questions
> This is my proposal so far, but I have a few additional questions. 
> First, is selecting Blazegraph, Virtuoso, GraphDB, Apache Jena TDB, and 
> Stardog as triplestore alternatives sufficient, or would it be better 
> to include more or fewer options? Since I’m still learning about best 
> practices in API development, I would greatly appreciate any guidance 
> or recommendations you can offer regarding this. Specifically, are 
> there any design patterns that would be particularly useful for 
> creating an abstraction layer that works seamlessly across different 
> triplestore alternatives? Additionally, should I follow a typical REST 
> approach (using POST, GET, PUT, DELETE), or is there a more suitable 
> method for interacting with triplestore databases over HTTP? Finally, 
> do you think this approach to the project is sound and are there any 
> concerns or improvements you would suggest? Any guidance you can offer 
> would be greatly appreciated.
>
> Looking forward to your response.
>
>  Best regards,
>  Maira Papadopoulou   
>
> On Wed, Mar 12, 2025 at 3:27 PM, Alexios Zavras 
> <zvr+eellak [ at ] zvr [ dot ] gr> wrote:
>> Thanks for your interest in the project, Maira.
>> 
>> I don't have much to add, and I'm looking forward
>> to receiving your application.
>> Keep in mind that for testing and analysis parts,
>> I can also provide data (more than enough!),
>> so that you can work with real-world data
>> and not only synthetic ones.
>> 
>> On Wed, Mar 12, 2025, at 11:40, Maira Papadopoulou wrote:
>> > Dear Alexios Zavras,
>> >
>> > I hope this email finds you well. My name is Maira and I am a 
>> > third-year undergraduate student at the Department of Informatics and 
>> > Telecommunications of the National and Kapodistrian University of Athens. I 
>> > recently came across the "Exploring and Abstracting Triplestore 
>> > Alternatives" project for Google Summer of Code 2025, and I am very 
>> > interested in contributing to it. I find the idea of analyzing and 
>> > developing an abstraction layer for various triplestore alternatives 
>> > both fascinating and impactful for the semantic web, especially in the 
>> > context of making RDF-based data management more accessible to 
>> > developers.
>> >
>> > Regarding the knowledge prerequisites, I have experience in Python and C, 
>> > which I believe will be valuable for both the implementation and 
>> > performance evaluation aspects of the project. Although I have no 
>> > direct experience with SPARQL, I do have experience with SQL, and 
>> > after researching SPARQL’s syntax, I noticed that its basic structure 
>> > is quite similar to SQL. Given this similarity, I believe I can adapt 
>> > to SPARQL quickly and effectively.
>> >
>> > To effectively contribute to the project, I plan to follow a structured 
>> > approach aligned with the methodology in the Contributor's Guidance:
>> > a) For the 'Research' part, I will begin by studying different 
>> > triplestore alternatives, such as Blazegraph, Virtuoso, GraphDB and 
>> > Stardog analyzing their architectures, query execution models, and 
>> > storage strategies.
>> > b) For the 'Testing' part, I will set up and run some basic SPARQL 
>> > queries on these different triplestores, testing their performance 
>> > under different conditions, such as varying dataset sizes.
>> > c) For the 'Analysis' part, I will identify advantages and weaknesses 
>> > of each alternative by benchmarking execution times and memory usage.
>> > d) For the 'Develop' part,  I would greatly appreciate further 
>> > clarification on the expected functionality of the Python library. 
>> > Understanding its intended role, whether it should primarily serve as a 
>> > middleware for processing and routing SPARQL queries or include 
>> > additional optimization features like translating RDF data to Python, 
>> > would help me strengthen my application.
>> > e) For the 'Documentation' part, once I gain a comprehensive 
>> > understanding of the implementation of the library, I will ensure that 
>> > the library is thoroughly documented so that other developers can 
>> > easily integrate and utilize the abstraction layer.
>> >
>> > I would love the opportunity to discuss this project further and 
>> > understand how I can best contribute. Please let me know if there are 
>> > any additional resources that would be helpful for me to review.
>> >
>> > Looking forward to your response.
>> >
>> > Best regards,
>> > Maira Papadopoulou
>> > ----
>> > You are receiving this message from the list: A discussion list for 
>> > student developers and mentors of Google Summer of Code projects,
>> > https://lists.ellak.gr/gsoc-developers/listinfo.html
>> > You can unsubscribe from the list by sending an empty email 
>> > to <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.
>> 
>> -- 
>> -- zvr -

-- 
-- zvr -
