Hi Vassilis,

thanks for your interest.

The rationale for having multiple hash values in the database is purely to facilitate querying. It's the difference between telling a user "ask for a hash value; get the result" and "you should have/install the file; you should have/install software to produce MY favorite hash; you should compute this hash; you can query with this hash". Keep in mind that the first 3 conditions may be actual steps to be performed. Why burden the user?

To take your idea to an extreme, I can tell the user: "oh, you have the file; compute its SWHID and check at archive.softwareheritage.org". No need for us to do anything; the functionality already exists. ;-)

On Sat, Mar 27, 2021, at 00:06, Vassilis Xanthopoulos wrote:
> Greetings everyone,
>
> My name is Vassilis Xanthopoulos and I am an undergraduate student at
> the National Technical University of Athens currently pursuing a degree
> in Electrical and Computer Engineering. Reading through the project ideas
> for GSoC 2021, the hashesDB project caught my eye and I have a question
> about a certain aspect of it.
>
> Is there any benefit in storing multiple hashes in the database for a
> single file instead of just one? I have some possible answers in mind,
> including:
>
> * Using an optimal hash function for each file type
>
>   This doesn't require storing all hashes for all files though,
>   just the right hash for each file.
>
> * Collision detection
>
>   I believe it's very improbable to find collisions in our data,
>   provided we use appropriate hash functions (maybe it's even a feature,
>   since we would like some locality properties in our hash functions).
>   All around it feels like a long shot.
>
> * Providing the flexibility of using various hash functions to
>   digest a file and query the database.
>
>   I consider this option to be a nice-to-have feature rather than a
>   reason on its own to add so much redundant information in the
>   database.
>
> Thanks in advance,
> Vassilis.
>
> ----
> You are receiving this message from the list: A discussion list for
> student developers and mentors of Google Summer of Code projects,
> https://lists.ellak.gr/gsoc-developers/listinfo.html
> You can unsubscribe from the list by sending an empty email to
> <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.

--
-- zvr -