- #1
FrostScYthe
Hi everyone,
I'm looking for advice on how to approach this problem: I have a query that I need to tokenize. Here is an example. I want to split
(animal cat & food meat) | (animal cow) into the following array:
[ ( ][ animal ][ cat ][ & ][ food ][ meat ][ ) ][ | ][ ( ][ animal ][ cow ][ ) ]
- I want to treat any run of consecutive whitespace (one or more characters) as a single delimiter.
- However, parentheses should be treated as separate tokens even when they are not separated from their neighbors by whitespace.
I'm wondering whether this can be done with regular expressions using Boost. This is what I am using for now, but it only works when the delimiter is exactly one space.
Code:
boost::regex re(" ");
boost::sregex_token_iterator p(query.begin(), query.end(), re, -1);
Or maybe there's an easier approach I don't know about. Feel free to throw in ideas :)
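For illustration, here is a minimal sketch of one possible approach: rather than describing the delimiter and iterating with -1, describe the tokens themselves and iterate over the matches (submatch index 0). It is written with std::regex so that it compiles without extra libraries; that is a substitution on my part, not the Boost code from the post, although Boost.Regex provides identically named classes (boost::regex, boost::sregex_token_iterator) with a similar interface. The function name tokenize and the exact pattern are assumptions made for this example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost or the original post):
// splits a query into tokens, treating each parenthesis as its own token
// and skipping any amount of whitespace between tokens.
std::vector<std::string> tokenize(const std::string& query) {
    // A token is either a single parenthesis, or a run of characters
    // that are neither whitespace nor parentheses.
    static const std::regex token_re(R"([()]|[^\s()]+)");

    std::vector<std::string> tokens;
    // Submatch index 0 yields the matched text itself (the tokens);
    // index -1 would instead yield the text between matches.
    std::sregex_token_iterator it(query.begin(), query.end(), token_re, 0);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

Calling tokenize("(animal cat & food meat) | (animal cow)") should produce the twelve tokens listed in the example array. If only the whitespace problem needed solving, changing the delimiter pattern to "\\s+" and keeping -1 would probably be enough, but that alone would not separate the parentheses from adjacent words.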