- #1
the.drizzle
Hiya! Long-time registrant, have not used these forums in years though...
Anyhow, the problem I have is that I need to collect data from a very large number of images that are essentially horizontal lines on a page. I've attached a sample here, which has been processed to find the edges of the lines--the attached file shows the edges of nine lines, unevenly spaced. I want to measure the width of each line over all pixels, and export that data to an external file for later processing.
In theory this would be quite simple: start at the top (or bottom) of the page, scan down (up) until one hits a red pixel, record that position, record the position of the next one, and the difference gives us the width for that line at that position. We then continue down (up) for all n lines, and repeat across the page for every column.
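A minimal sketch of that naive scan, assuming the image has already been loaded as a boolean mask where `mask[y][x]` is `True` for a red (edge) pixel. The names `mask`, `column_widths` and `measure` are made up for illustration, and the pairing of consecutive edges only holds on a clean image:

```python
def column_widths(mask, x):
    """Return (top, bottom, width) for each pair of edge pixels in column x.

    mask is a list of rows; mask[y][x] is True where the pixel is red (an edge).
    Assumes a noise-free column: edges alternate top, bottom, top, bottom, ...
    """
    # Rows where a run of red pixels starts, so a thick edge is counted once.
    edges = [y for y in range(len(mask))
             if mask[y][x] and (y == 0 or not mask[y - 1][x])]
    # Pair consecutive edges: (top, bottom) of line 1, then of line 2, etc.
    return [(top, bottom, bottom - top)
            for top, bottom in zip(edges[0::2], edges[1::2])]


def measure(mask):
    """Map each column index to its list of (top, bottom, width) tuples."""
    return {x: column_widths(mask, x) for x in range(len(mask[0]))}
```

Each entry of the result could then be written out one row per column for later processing. A single stray red pixel in a column shifts every pairing below it, which is why this approach falls over on the real images.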
The tricky bit, though, is that the images are noisy. That is, there are small "holes" in the lines and a few outside them as well. What I need to do is figure out a method of deciding which bits are noise and which bits are not. I can't manually delete the "holes" from the data set, as there are literally thousands of these to process, and that would take weeks.
I'm thinking some sort of pattern recognition algorithm would be a good idea, but I'm not sure how they work...
Thus, what I'm looking for here are some possible ideas as to how one might go about filtering the noise from the image in some manner, or perhaps an algorithm that may be able to decide which points are line and which ones are noise when scanning the image. I've been at this for some time now, and am running out of ideas...
Thanks in advance for any help!