- #1
hoffmann
- 70
- 0
Say I have two lists, List1 and List2 containing elements such as words. Some words are common two both List1 and List2. I want to create a distance metric that tells me how far apart the two lists are based on a similarity "score". The similarity score and distance metric are as follows:
Similarity score: Intersection(List1,List2) / Union(List1,List2)
Distance = 1 - Similarity Score
In other words, the similarity score is just the percentage overlap between the two lists and the distance is 0 when the two lists are the same. Say I generalize this to n lists and I calculate the distances between lists (a symmetric matrix of distances). My question is, is this distance formula valid? In other words, does it satisfy the triangle inequality? How do I check this?
Similarity score: Intersection(List1,List2) / Union(List1,List2)
Distance = 1 - Similarity Score
In other words, the similarity score is just the percentage overlap between the two lists and the distance is 0 when the two lists are the same. Say I generalize this to n lists and I calculate the distances between lists (a symmetric matrix of distances). My question is, is this distance formula valid? In other words, does it satisfy the triangle inequality? How do I check this?