- #1
Saladsamurai
This is in Excel VBA.
Here is what I am trying to do. I have a MathML file saved as a .txt file. It is similar to XML.
From the MathML file, we have a bunch of text that looks something like:
<mname>Salad</mname><mrow>xyz</mrow>
I would ultimately like to have an array whose elements are the substrings:
Array(1) = <mname>
Array(2) = Salad
Array(3) = </mname>
So far, this is what I have done:
I load the entire text file into one continuous string.
I then feed the string into an array called ElementaryArray, so that each individual character of the giant string is an element of the array.
Now I would like to sweep through the array, element by element, and determine the start and end positions of each substring.
If it were simply a bunch of substrings like <mmm><mmm><rrr><rrr><ooo><ooo> it would be easy enough. I could simply find the first "<" and then find its corresponding closing ">" and then restart the loop at the ">" position.
The problem is that not all of the substrings start and end with the "<" and ">" characters.
I need a way to determine whether the character after a ">" is another "<". If it is not, I must mark that character's position and then find the next occurrence of "<", which sits one position past the end of the substring that does not start with a "<". Does that all make sense? The tricky part is relating the two different cases via the counter so that I do not get any overlap.
Any ideas?
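One possible sketch of the sweep described above, offered as a starting point rather than a finished solution. It assumes working on the string directly with InStr and Mid$ instead of the character array, and it uses a single position counter that always jumps to the character just past the token it has read, so the two cases cannot overlap. The names TokenizeMarkup and DemoTokenize are hypothetical, and the sketch assumes that no ">" appears inside an attribute value and that whitespace between tags should be kept as its own text token.

```vba
Option Explicit

' Hypothetical helper (name and 1-based result array are assumptions):
' splits markup into tags ("<...>") and the text runs between them.
Public Function TokenizeMarkup(ByVal src As String) As String()
    Dim tokens() As String
    Dim count As Long
    Dim pos As Long        ' 1-based start of the current token
    Dim stopPos As Long    ' 1-based position of the token's last character

    ' Empty input: return the array unallocated.
    If Len(src) = 0 Then Exit Function

    ' Upper bound on the token count: one token per character.
    ReDim tokens(1 To Len(src))

    pos = 1
    Do While pos <= Len(src)
        If Mid$(src, pos, 1) = "<" Then
            ' Case 1: a tag, which ends at the next ">".
            stopPos = InStr(pos, src, ">")
            If stopPos = 0 Then stopPos = Len(src)    ' unterminated tag: take the rest
        Else
            ' Case 2: plain text, which ends just before the next "<".
            stopPos = InStr(pos, src, "<") - 1
            If stopPos < pos Then stopPos = Len(src)  ' no further tag: take the rest
        End If

        count = count + 1
        tokens(count) = Mid$(src, pos, stopPos - pos + 1)

        ' Restart immediately after the token just stored.
        pos = stopPos + 1
    Loop

    ' Trim the array down to the tokens actually found.
    ReDim Preserve tokens(1 To count)
    TokenizeMarkup = tokens
End Function

' Prints each token with its index in the Immediate window.
Public Sub DemoTokenize()
    Dim parts() As String
    Dim i As Long

    parts = TokenizeMarkup("<mname>Salad</mname><mrow>xyz</mrow>")
    For i = LBound(parts) To UBound(parts)
        Debug.Print i, parts(i)
    Next i
End Sub
```

For the sample line above, this should give six elements: <mname>, Salad, </mname>, <mrow>, xyz, </mrow>. The same two-case test could be applied to ElementaryArray element by element; InStr just does the inner search for the next "<" or ">" in one call.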