regex - Lookarounds in Python
I have a problem regarding lookarounds in Python:
    >>> spacereplace = re.compile(b'(?<!\band)(?<!\bor)\s(?!or\b)(?!and\b)', re.I)
    >>> q = "a b (c or d)"
    >>> q = spacereplace.sub(" , ", q)
    >>> q
    # What is meant to happen:
    'a , b , (c or d)'
    # What happens instead:
    'a , b , (c , or , d)'
The regex is supposed to match any space that is not next to the words "and" or "or", but this doesn't seem to be working.
Can anyone help me with this?
Edit: in response to a commenter, I have broken the regex down into multiple lines.
    (?<!\band)  # Looks behind the \s, matching if there isn't a word break, followed by "and", there.
    (?<!\bor)   # Looks behind the \s, matching if there isn't a word break, followed by "or", there.
    \s          # Matches a single whitespace character.
    (?!or\b)    # Looks after the \s, matching if there isn't the word "or", followed by a word break, there.
    (?!and\b)   # Looks after the \s, matching if there isn't the word "and", followed by a word break, there.
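You presumably confused the raw string modifier r with b. With the b prefix, \b inside the literal is still interpreted as a backspace character rather than reaching the regex engine as a word boundary.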
    >>> import re
    >>> spacereplace = re.compile(r'(?<!\band)(?<!\bor)\s(?!or\b)(?!and\b)', re.I)
    >>> q = "a b (c or d)"
    >>> spacereplace.sub(" , ", q)
    'a , b , (c or d)'
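To see why the prefix matters, here is a minimal sketch, assuming Python 3. The names PATTERN, space_replace and join_terms are chosen for illustration only and are not from the question. It first checks what "\b" means in a normal and in a raw literal, then applies the raw-string pattern.

```python
import re

# In a normal (non-raw) literal, "\b" is the backspace character, code point 8.
assert "\b" == "\x08" and len("\b") == 1
# In a raw literal the backslash survives, so the regex engine receives \b
# and treats it as a word boundary.
assert r"\b" == "\\b" and len(r"\b") == 2

# Hypothetical names, used only for this illustration.
PATTERN = r"(?<!\band)(?<!\bor)\s(?!or\b)(?!and\b)"
space_replace = re.compile(PATTERN, re.I)


def join_terms(query):
    """Replace each whitespace character not adjacent to 'and'/'or' with ' , '."""
    return space_replace.sub(" , ", query)


print(join_terms("a b (c or d)"))  # a , b , (c or d)
print(join_terms("x and y z"))     # x and y , z
```

The two assertions at the top are the whole story: the pattern in the question contained literal backspace characters, so none of the lookarounds could ever find "and" or "or" preceded or followed by one.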
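Sometimes, if a regexp doesn't work, it may help to debug it with the re.DEBUG flag. In this case, doing so shows that the word boundary \b is not detected (it is parsed as literal 8, a backspace), which may give you a hint about where to search for the mistake: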
    >>> spacereplace = re.compile(b'(?<!\band)(?<!\bor)\s(?!or\b)(?!and\b)', re.I | re.DEBUG)
    assert_not -1
      literal 8
      literal 97
      literal 110
      literal 100
    assert_not -1
      literal 8
      literal 111
      literal 114
    in
      category category_space
    assert_not 1
      literal 111
      literal 114
      literal 8
    assert_not 1
      literal 97
      literal 110
      literal 100
      literal 8
    >>> spacereplace = re.compile(r'(?<!\band)(?<!\bor)\s(?!or\b)(?!and\b)', re.I | re.DEBUG)
    assert_not -1
      at at_boundary
      literal 97
      literal 110
      literal 100
    assert_not -1
      at at_boundary
      literal 111
      literal 114
    in
      category category_space
    assert_not 1
      literal 111
      literal 114
      at at_boundary
    assert_not 1
      literal 97
      literal 110
      literal 100
      at at_boundary
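The same comparison can be done programmatically. The following is a rough sketch, assuming Python 3, where re.DEBUG prints its dump to standard output and spells the node names in upper case (the exact layout varies between versions). The helper debug_dump is a hypothetical name introduced here.

```python
import contextlib
import io
import re


def debug_dump(pattern):
    """Compile `pattern` with re.DEBUG and return the text that re prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, re.I | re.DEBUG)
    return buffer.getvalue()


# Broken variant: "\b" in a normal literal is a backspace character.
# Only \s is kept raw, to avoid an invalid-escape warning for "\s".
broken = "(?<!\band)(?<!\bor)" + r"\s" + "(?!or\b)(?!and\b)"
# Fixed variant: the whole pattern is a raw string.
fixed = r"(?<!\band)(?<!\bor)\s(?!or\b)(?!and\b)"

broken_dump = debug_dump(broken)
fixed_dump = debug_dump(fixed)

# Compare case-insensitively because the spelling differs across versions.
print("at_boundary" in broken_dump.lower())  # False
print("at_boundary" in fixed_dump.lower())   # True
```

If the boundary node is missing from the dump, the \b never reached the regex engine as an escape, which points at the string literal rather than at the lookaround logic.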