tickle_monste
1. Homework Statement
I originally posted this in Calculus & Beyond because the idea came from math, but I realize now that it belongs in this section. Suppose I have a set W with many subsets w1...wn that intersect arbitrarily and are, in the worst case, unordered. I want to know what a good representation in a hash table would be. Basically, I want results such as A[tex]\cup[/tex]B, A[tex]\cap[/tex]B, A - B, etc. (the basic set-theory operations) to each have an associated hash-table address where a pointer to exactly that set is stored, and I want to know how to quantify the limitations of such a data structure.
2. Homework Equations
3. The Attempt at a Solution
I'll give an example to motivate my solution, with just two subsets of W: A and B, which intersect.
W can be divided into the following partitions:
W - A - B
A-B
B-A
A[tex]\cap[/tex]B
None of these subsets intersect each other, so no data is stored redundantly; only the metadata can be redundant. What I mean is that when I specify the set A, it translates in the hash table to [A-B][tex]\cup[/tex][A[tex]\cap[/tex]B], and B translates to [B-A][tex]\cup[/tex][A[tex]\cap[/tex]B], i.e. the metadata consists of references to non-redundant partitions.
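The two-set example above can be sketched in Python roughly as follows. This is only an illustration under assumptions of mine: the sample elements and the names `atoms`, `A_refs`, `B_refs`, and `resolve` are hypothetical, and a Python dict stands in for the hash table.

```python
# Hypothetical sample data: W has 6 objects, and A and B overlap.
W = frozenset(range(6))
A = frozenset({0, 1, 2})
B = frozenset({2, 3})

# The four disjoint pieces of W; every object is stored exactly once.
atoms = {
    "W-A-B": W - A - B,
    "A-B": A - B,
    "B-A": B - A,
    "A&B": A & B,
}

# Metadata: each named set is just a set of references (keys) into `atoms`.
A_refs = frozenset({"A-B", "A&B"})
B_refs = frozenset({"B-A", "A&B"})


def resolve(refs):
    """Materialize the actual objects from a set of atom references."""
    out = set()
    for key in refs:
        out |= atoms[key]
    return frozenset(out)


# Set operations on the data reduce to set operations on the references.
union_refs = A_refs | B_refs   # references for A union B
inter_refs = A_refs & B_refs   # references for A intersect B
diff_refs = A_refs - B_refs    # references for A - B
```

Because the pieces are disjoint, union, intersection, and difference never touch the objects themselves; they only combine reference sets, which is where the metadata cost shows up.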
Given that there are N objects in W, enough subsets could divide W into as many as N partitions (each containing a single object), so while there is no redundant data, the metadata becomes increasingly bulky and redundant.
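To get a feel for how the metadata grows, here is a small sketch that builds the disjoint pieces for any number of subsets by grouping objects on their membership pattern. The function name `build_atoms` and the sample subsets are my own assumptions, chosen so that the worst case (N pieces for N objects) is reached.

```python
from collections import defaultdict


def build_atoms(universe, subsets):
    """Group objects by membership signature: one piece per distinct signature."""
    atoms = defaultdict(set)
    for x in universe:
        # signature[i] is True when x belongs to subsets[i]
        signature = tuple(x in s for s in subsets)
        atoms[signature].add(x)
    return dict(atoms)


# Hypothetical worst case: N = 8 objects, k = 3 subsets, every object
# ends up alone in its own piece.
universe = range(8)
subsets = [{0, 1, 2, 3}, {0, 1, 4, 5}, {0, 2, 4, 6}]

atoms = build_atoms(universe, subsets)
n_atoms = len(atoms)

# Metadata size: total number of piece references needed to describe
# all the subsets (one reference per True entry in each signature).
metadata_refs = sum(1 for sig in atoms for member in sig if member)
```

With k subsets there are at most 2^k distinct signatures and at most N non-empty pieces, so the number of pieces is bounded by min(2^k, N). In this sample each of the 8 objects is stored once, while the three subsets together need 12 references.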
I could reverse this and have zero-redundancy metadata with redundant data. I am wondering what the limitations of both approaches are, and what compromises between the two can be made. Mixing the representations would add another layer of metadata, and I am not quite sure how to go about that, or whether I'm even going down the right path with this.