- #1
TylerH
- 729
- 0
I'm writing a function to take a string like "aXYb" and return a regex in which the lower case letters act like actual character and the upper case become free variables.
The regex generated from "aXYb" should match anything of the form a([a-z]+)([a-z]+)b. It does. But not exactly as "freely" as I would like. For example, when I re.compile(a([a-z]+)([a-z]+)b).match("aaxab"), the only tuple I get back is ("ax", "a").
Preferably, I'd like both ("a", "xa") and ("ax", "a"). How can I change "a([a-z]+)([a-z]+)b" to get these results?
For a little background, it's intended to be used in describing rules for algebraic manipulations (in the abstract algebra sense). The final goal is to be able do a breadth first search of all manipulations until I get to the desired result. The free variables are used in describing the variable part of valid manipulations, like (AB=AC -> B=C). That's why overlapping is absolutely necessary.
The regex generated from "aXYb" should match anything of the form a([a-z]+)([a-z]+)b. It does. But not exactly as "freely" as I would like. For example, when I re.compile(a([a-z]+)([a-z]+)b).match("aaxab"), the only tuple I get back is ("ax", "a").
Preferably, I'd like both ("a", "xa") and ("ax", "a"). How can I change "a([a-z]+)([a-z]+)b" to get these results?
For a little background, it's intended to be used in describing rules for algebraic manipulations (in the abstract algebra sense). The final goal is to be able do a breadth first search of all manipulations until I get to the desired result. The free variables are used in describing the variable part of valid manipulations, like (AB=AC -> B=C). That's why overlapping is absolutely necessary.