Question:
Replacing only single instances of a character with Python regexp
I am trying to replace single $ characters with something else, and want to ignore multiple $ characters in a row, and I can’t quite figure out how. I tried using lookahead:
s = '$a $$b $$$c $d'
re.sub(r'\$(?!\$)', 'z', s)
This gives me:
'za $zb $$zc zd'
when what I want is
'za $$b $$$c zd'
What am I doing wrong?
Answer #1:
Notes, if not using a callable for the replacement:
- you would need a look-ahead, because you must not match if followed by $
- you would need a look-behind, because you must not match if preceded by $
Not as elegant, but this is very readable:
>>> def dollar_repl(matchobj):
...     val = matchobj.group(0)
...     if val == '$':
...         val = 'z'
...     return val
...
>>> import re
>>> s = '$a $$b $$$c $d'
>>> re.sub(r'\$+', dollar_repl, s)
'za $$b $$$c zd'
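The same callback idea can also be written inline. This is only a sketch of an equivalent form, not part of the original answer; it checks the length of each matched run instead of comparing against '$':

```python
import re

s = '$a $$b $$$c $d'

# \$+ matches each whole run of dollar signs in one go;
# only a run of length 1 is swapped for 'z', longer runs are returned unchanged.
result = re.sub(r'\$+', lambda m: 'z' if len(m.group(0)) == 1 else m.group(0), s)
print(result)  # za $$b $$$c zd
```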
Answer #2:
Hmm. It looks like I can get it to work if I use both lookahead and lookbehind. Seems like there should be an easier way, though.
>>> re.sub(r'(?<!\$)\$(?!\$)', 'z', s)
'za $$b $$$c zd'
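To see why the lookahead alone was not enough, it may help to run both patterns side by side: the last $ of every run is never followed by another $, so a lookahead-only pattern still matches it. A minimal sketch; the helper name replace_single is made up for illustration:

```python
import re

s = '$a $$b $$$c $d'

# Lookahead only: the final '$' of each run has no '$' after it, so it still matches.
lookahead_only = re.sub(r'\$(?!\$)', 'z', s)

def replace_single(text, repl='z'):
    # Hypothetical helper: match a '$' with no '$' directly before or after it.
    return re.sub(r'(?<!\$)\$(?!\$)', repl, text)

print(lookahead_only)      # za $zb $$zc zd
print(replace_single(s))   # za $$b $$$c zd
```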
Answer #3:
OK, without lookaround and without a callback function:
re.sub(r'(^|[^$])\$([^$]|$)', r'\1z\2', s)
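One thing to watch with this approach: the pattern consumes the neighbouring characters, so two single $ signs separated by exactly one character cannot both match. The sketch below compares it with the lookaround pattern on such an input; the variable names are illustrative only:

```python
import re

consuming = r'(^|[^$])\$([^$]|$)'   # consumes the characters around the '$'
lookaround = r'(?<!\$)\$(?!\$)'     # only inspects them, consumes nothing extra

s = '$a $$b $$$c $d'
print(re.sub(consuming, r'\1z\2', s))   # za $$b $$$c zd

# The 'a' is consumed by the first match, so the second '$' has
# no preceding character left to satisfy (^|[^$]).
t = '$a$b'
print(re.sub(consuming, r'\1z\2', t))   # za$b
print(re.sub(lookaround, 'z', t))       # zazb
```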
Answer #4:
An alternative with re.split:
''.join('z' if x == '$' else x for x in re.split(r'(\$+)', s))