A_Seagull
An algorithm for data compression?
I come from a philosophy and science background, and I'm particularly interested in finding out what a generalised data compression algorithm would look like. (By 'algorithm' I mean something like a flowchart showing how the process could be done.)
Effectively, much of science is about data compression. For example 'F=ma' is a means of greatly compressing data concerning an object moving in a force field.
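To make that claim concrete, here is a small illustration of my own (not anything from the physics literature): instead of storing many sampled positions of a uniformly accelerating object, one can store a single parameter (the acceleration) together with the rule p(t) = ½at². The law plus one number losslessly recreates the whole data set.

```python
# A physical law as data compression: store one parameter plus the rule
# p(t) = 0.5 * a * t^2 instead of a long table of sampled positions.

def sample_positions(a, n):
    """The 'raw data': position at times t = 0..n-1 under constant acceleration a."""
    return [0.5 * a * t * t for t in range(n)]

def compress(positions):
    """Recover the single parameter a from the data (assuming the law holds)."""
    return 2.0 * positions[1]          # since p(1) = a/2

def decompress(a, n):
    """Recreate the original data from the one stored parameter."""
    return [0.5 * a * t * t for t in range(n)]

raw = sample_positions(9.8, 1000)      # 1000 numbers
a = compress(raw)                      # 1 number
assert decompress(a, 1000) == raw      # lossless reconstruction
```

The 'compression ratio' here is extreme (one parameter for a thousand data points), which is exactly the sense in which a law like F=ma compresses observations.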
Also, one of the brain's intuitive processes is data compression. For example, it can take a multitude of sense-data and compress it into the simple string 'tree'.
So I am interested in how this could be achieved at a fundamental level. I am seeking a generalised algorithm that could, in principle and at a much later date, be coded into a computer language and run on a computer.
I have had several attempts myself to create such an algorithm but without much success.
I suspect that it may be very hard to find an entirely generalised algorithm that could efficiently compress any compressible string of data. Instead, it may be necessary to start with some form of 'template' to initiate the data compression process.
So for the purpose of the exercise, I suggest that one consider as input a string of integers. This input string could be considered to be infinite in length, but for the purpose of the exercise some cut-off would undoubtedly be necessary.
The output of the algorithm would contain a brief description of the 'pattern' that had been found, together with a process for recreating the original data string.
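One way to make the 'template' idea concrete is the following sketch of my own (not a standard algorithm): keep a small library of candidate pattern templates, try each against the whole input string, and output the first (name, parameters) description that works, with a matching decoder to recreate the data.

```python
# A minimal 'template library' compressor for integer strings. Each template
# tries to describe the whole string; the output is a (name, parameters)
# pair, and decompress() recreates the original data from it.

def fit_constant(xs):
    if len(set(xs)) == 1:
        return ("constant", (xs[0], len(xs)))
    return None

def fit_arithmetic(xs):
    if len(xs) >= 2:
        step = xs[1] - xs[0]
        if all(xs[i] - xs[i - 1] == step for i in range(1, len(xs))):
            return ("arithmetic", (xs[0], step, len(xs)))
    return None

def fit_periodic(xs):
    for p in range(1, len(xs) // 2 + 1):
        if all(xs[i] == xs[i % p] for i in range(len(xs))):
            return ("periodic", (tuple(xs[:p]), len(xs)))
    return None

TEMPLATES = [fit_constant, fit_arithmetic, fit_periodic]

def compress(xs):
    for fit in TEMPLATES:
        desc = fit(xs)
        if desc is not None:
            return desc
    return ("literal", tuple(xs))      # no pattern found: store verbatim

def decompress(desc):
    name, params = desc
    if name == "constant":
        value, n = params
        return [value] * n
    if name == "arithmetic":
        start, step, n = params
        return [start + step * i for i in range(n)]
    if name == "periodic":
        block, n = params
        return [block[i % len(block)] for i in range(n)]
    return list(params)                # literal fallback
```

For example, `compress([2, 4, 6, 8, 10])` yields `("arithmetic", (2, 2, 5))` — a four-number description of a five-number string, and arbitrarily better as the string grows. Extending the template library (polynomials, repeated substrings, and so on) extends what the algorithm can compress, which is the sense in which generality here is bought template by template.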
Clearly the input data cannot be random, otherwise no data compression would be possible. It must contain some form of 'pattern', even if it is only an approximate pattern.
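This point can be demonstrated with an off-the-shelf general-purpose compressor (zlib, which is not the kind of pattern-describing algorithm I am after, but makes the contrast visible): patterned data shrinks dramatically, while data with no exploitable pattern does not shrink at all — in fact it grows slightly, from container overhead. (I use chained SHA-256 output below as a deterministic stand-in for 'random' data.)

```python
# Patterned vs patternless data under a general-purpose compressor.
import hashlib
import zlib

patterned = b"123123123" * 100                 # 900 bytes with period 3

# Deterministic stand-in for random data: chained SHA-256 digests.
chunks, seed = [], b"seed"
for _ in range(32):
    seed = hashlib.sha256(seed).digest()
    chunks.append(seed)
randomish = b"".join(chunks)                   # 1024 pseudo-random bytes

print(len(zlib.compress(patterned)))           # far below 900
print(len(zlib.compress(randomish)))           # about 1024, or slightly more
```

The underlying reason is a counting argument: there are more strings of length n than descriptions shorter than n, so no algorithm can compress every string.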
Perhaps initially it can be assumed that the data can be described exactly by a pattern, in other words that there are no 'errors' in the data. (However, subsequently it would be interesting to see how an algorithm could cope with some data points not fitting the pattern perfectly, as this would be more 'realistic'.)
There would also need to be some provision for criteria for the algorithm to halt and produce its answer.
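One concrete halting criterion is the minimum-description-length (MDL) idea from the literature: stop and report a pattern only if its description is shorter than the raw data; otherwise emit the data verbatim. Here is a crude sketch of that acceptance test, where 'cost' is simply a count of integers rather than a real bit length.

```python
# MDL-style acceptance test: a pattern is only worth reporting if describing
# it takes fewer symbols than listing the data itself. A description is a
# (name, parameters) pair; costs are crude symbol counts, not real bit lengths.

def description_cost(desc):
    name, params = desc
    flat = []
    for p in params:
        if isinstance(p, tuple):
            flat.extend(p)         # e.g. a repeating block counts per element
        else:
            flat.append(p)
    return 1 + len(flat)           # 1 symbol for the pattern name + parameters

def worth_compressing(desc, xs):
    """Halt-and-emit criterion: only accept a genuinely shorter description."""
    return description_cost(desc) < len(xs)

# e.g. ("arithmetic", (2, 2, 1000)) costs 4 symbols against 1000 raw
# integers and is accepted, while a 'pattern' as long as the data is not.
```

A real version would measure both sides in bits, but the structure of the criterion is the same: the search halts when no candidate description beats the raw listing.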
Such an algorithm may already exist, but if so I have not uncovered it.
Any suggestions? Any solutions?