Hi Alexios,

I've been trying to load multiple SBOMs into the same Jena dataset to understand how reuse would actually work. I now understand why keeping each SBOM in a separate named graph doesn't really help with reuse.

So I started thinking about how we should handle package identity across SBOMs before inserting data. Since SPDX IDs are document-scoped, they don't seem useful for deduplication across different SBOMs, so I was wondering whether the abstraction layer should derive some kind of canonical identity. For example, should we:

- use the purl when it exists,
- otherwise fall back to name + version + ecosystem?

I was thinking about a simple case like:

pkg:deb/debian/openssl@1.1.1k
pkg:docker/library/openssl@1.1.1k

Here the name and version are the same, but the ecosystem is different. Would these be treated as two separate packages, or should they somehow map to a common canonical entity?

I'm also unsure about the modeling approach. Should we keep SBOM-specific package nodes and link them to a global canonical package node, or should we merge them directly when their identities match?

I'm trying to understand whether identity resolution should be handled explicitly in the abstraction layer instead of relying on whatever merging the RDF store does automatically.

I would appreciate your guidance on which identity model you think fits this project best.

Best regards,
Manav Gupta
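P.S. To make the question concrete, here is a rough sketch of the "purl first, otherwise name + version + ecosystem" idea, combined with the "keep SBOM-specific nodes and link them to a canonical key" option. Everything in it is my own assumption rather than existing project code: the helper names `canonical_key` and `link_to_canonical`, the record fields, the `fallback:` prefix, the choice to drop purl qualifiers and subpaths, and lowercasing names in the fallback (which may be wrong for case-sensitive ecosystems). The purl handling is a naive string split, not a full purl parser.

```python
from collections import defaultdict
from typing import Optional


def canonical_key(purl: Optional[str], name: str, version: str, ecosystem: str) -> str:
    """Hypothetical helper: derive a cross-SBOM identity key for one package."""
    if purl:
        # Assumption: ignore qualifiers (?arch=...) and subpath (#...), so the
        # same package reported with different qualifiers maps to one key.
        core = purl.split("#", 1)[0].split("?", 1)[0]
        scheme, _, rest = core.partition(":")
        pkg_type, _, remainder = rest.partition("/")
        # The purl type is case-insensitive, so normalise it to lowercase.
        return f"{scheme.lower()}:{pkg_type.lower()}/{remainder}"
    # Fallback when no purl is available (assumed format, not a standard).
    return f"fallback:{ecosystem.lower()}/{name.lower()}@{version}"


def link_to_canonical(records):
    """Map each canonical key to the document-scoped (doc, spdx_id) nodes behind it.

    The SBOM-specific nodes are kept; they are only linked to a shared key.
    """
    links = defaultdict(list)
    for r in records:
        key = canonical_key(r.get("purl"), r["name"], r["version"], r["ecosystem"])
        links[key].append((r["doc"], r["spdx_id"]))
    return dict(links)


# Made-up records for illustration only.
records = [
    {"doc": "sbom-a", "spdx_id": "SPDXRef-1", "purl": "pkg:deb/debian/openssl@1.1.1k",
     "name": "openssl", "version": "1.1.1k", "ecosystem": "deb"},
    {"doc": "sbom-b", "spdx_id": "SPDXRef-7", "purl": "pkg:deb/debian/openssl@1.1.1k?arch=amd64",
     "name": "openssl", "version": "1.1.1k", "ecosystem": "deb"},
    {"doc": "sbom-c", "spdx_id": "SPDXRef-3", "purl": "pkg:docker/library/openssl@1.1.1k",
     "name": "openssl", "version": "1.1.1k", "ecosystem": "docker"},
]
links = link_to_canonical(records)
```

With this sketch the two Debian entries end up under one key, while the Docker entry gets its own, because the purl type is part of the key. Whether that is the behaviour we want is exactly what I am asking about.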
----
You are receiving this message from the list: A discussion list for student developers and mentors of Google Summer of Code projects, https://lists.ellak.gr/gsoc-developers/listinfo.html
You can unsubscribe from the list by sending an empty email to <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.