Algorithm creates representative set of data

mundek88 · May 23, 2016

Hi all,
I have an algorithm that I need to analyze and make easier to implement in a programming language (Python). We have a table of data and want to select only a representative part of it.

It looks like:
ID_PRODUCT | CARDINALITY | SET VARIANCE WITH THIS ELEMENT AND ABOVE
10 | 110 | 400
11 | 90 | 350
12 | 80 | 300
... | ... | ...

* The variance is calculated over the cardinality column.

The algorithm works as follows:
Iterate over the rows from the top of the table. In each iteration, add the next row and compute the variance of the cardinality column over the rows collected so far. Stop iterating when the variance is equal to or less than a specified value X (so in the end we want a set of rows whose variance is greater than X), then return the collected set, which is now the representative set.
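A minimal Python sketch of the loop described above. Several things here are assumptions, not part of the legacy code: the function name `select_representative`, rows given as `(id_product, cardinality)` tuples already in table order, population variance, a `min_rows` guard (the variance of a single row is always 0, so the stop test is skipped until at least that many rows are present), and the choice to leave out the row that pushes the variance down to the threshold. The running variance uses Welford's update so each row is processed once.

```python
def select_representative(rows, threshold, min_rows=2):
    """Collect rows from the top until the running variance of the
    cardinality column drops to `threshold` or below.

    rows      -- iterable of (id_product, cardinality), in table order
    threshold -- the variance limit X
    min_rows  -- assumption: do not test the stop condition before the
                 candidate set has this many rows
    """
    selected = []
    count, mean, m2 = 0, 0.0, 0.0  # Welford running statistics

    for id_product, cardinality in rows:
        # Tentatively add the row and update the running statistics.
        new_count = count + 1
        delta = cardinality - mean
        new_mean = mean + delta / new_count
        new_m2 = m2 + delta * (cardinality - new_mean)
        variance = new_m2 / new_count  # population variance

        # Assumption: the row that triggers the stop is not kept.
        if new_count >= min_rows and variance <= threshold:
            break

        count, mean, m2 = new_count, new_mean, new_m2
        selected.append((id_product, cardinality))

    return selected


if __name__ == "__main__":
    table = [(10, 110), (11, 90), (12, 80)]
    print(select_representative(table, threshold=50))
```

If the legacy code keeps the triggering row or uses sample variance instead, the `break` placement and the divisor are the two lines to change.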

Question:
This is a legacy solution, and it is hard for me to say how we could do it better. Is there a mathematical tool that cuts away elements that are hardly representative? We cannot filter statically on cardinality (e.g. keep only rows with cardinality > 50), because the order of magnitude can change from day to day.

Thanks in advance!

jvicens · May 23, 2016

Hi there,

Thank you for sharing your algorithm for analyzing and selecting representative data. It seems like you have a good understanding of the problem and have come up with a solution that works for your needs.

In terms of potential improvements, there are a few things you could consider. First, you could derive the variance threshold from the data itself, for example from the standard deviation or the mean absolute deviation, instead of using a fixed value. A threshold computed this way adapts when the scale of the cardinalities changes from day to day.
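As a rough illustration of a scale-independent measure, here is a small sketch using the coefficient of variation (standard deviation divided by the mean). This particular measure is my assumption about one way to apply the idea, and the function name `relative_spread` is hypothetical. Because the result does not change when every cardinality is multiplied by the same factor, a limit expressed this way would not need adjusting when the order of magnitude shifts.

```python
from statistics import mean, pstdev


def relative_spread(cardinalities):
    """Coefficient of variation: population standard deviation / mean.

    Assumes non-empty input with a non-zero mean.
    """
    avg = mean(cardinalities)
    if avg == 0:
        raise ValueError("mean is zero; relative spread is undefined")
    return pstdev(cardinalities) / avg


if __name__ == "__main__":
    today = [110, 90, 80]
    busier_day = [1100, 900, 800]  # same shape, ten times the volume
    # Both calls give the same value up to floating-point rounding.
    print(relative_spread(today), relative_spread(busier_day))
```

Whether a relative measure like this actually picks out the rows you consider representative is something you would have to check against your own data.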

Another approach could be to use machine learning techniques to identify patterns in the data and determine which rows are most representative. This would require training a model on a large dataset and then using it to make predictions on new data.

Overall, there are many mathematical and statistical tools that could potentially help with selecting representative data, but the best approach will depend on the specific characteristics of your data and the problem you are trying to solve. I would recommend doing some research and experimenting with different methods to see which one works best for your situation.

Best of luck with your analysis and programming!
