replacing only single instances of a character with python regexp

Question:

I am trying to replace single $ characters with something else, and want to ignore multiple $ characters in a row, and I can’t quite figure out how. I tried using lookahead:

import re
s = '$a $$b $$$c $d'
re.sub(r'\$(?!\$)', 'z', s)

This gives me:

'za $zb $$zc zd'

when what I want is

'za $$b $$$c zd'

What am I doing wrong?

Asked By: Jason S

Answer #1:

Notes, if you are not using a callable as the replacement:

  • you need a look-ahead, because the $ must not match if it is followed by another $
  • you need a look-behind, because the $ must not match if it is preceded by another $

Using a callable is not as elegant, but it is very readable:

>>> def dollar_repl(matchobj):
...     val = matchobj.group(0)
...     if val == '$':
...         val = 'z'
...     return val
... 
>>> import re
>>> s = '$a $$b $$$c $d'
>>> re.sub(r'\$+', dollar_repl, s)
'za $$b $$$c zd'
Answered By: dnozay
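
To see why the look-ahead alone is not enough, it can help to list the positions each pattern matches. This is only a small sketch; the variable names `ahead_only` and `both` are illustrative and not part of the original answer:

```python
import re

s = '$a $$b $$$c $d'

# Look-ahead only: the last $ of every run is "not followed by $", so it matches too.
ahead_only = [m.start() for m in re.finditer(r'\$(?!\$)', s)]

# Look-behind plus look-ahead: only a $ with no $ on either side matches.
both = [m.start() for m in re.finditer(r'(?<!\$)\$(?!\$)', s)]

print(ahead_only)  # [0, 4, 9, 12]
print(both)        # [0, 12]
```

Positions 4 and 9 are the final $ of "$$" and "$$$", which is exactly where the unwanted z appears in the question's output.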

Answer #2:

Hmm. It looks like I can get it to work if I use both lookahead and lookbehind. Seems like there should be an easier way, though.

>>> re.sub(r'(?<!\$)\$(?!\$)', 'z', s)
'za $$b $$$c zd'
Answered By: Jason S
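
The same lookaround idea should carry over to characters other than $. A minimal sketch, assuming a hypothetical helper named `replace_single` (not from the original answers) that takes a single character and escapes it with `re.escape`:

```python
import re

def replace_single(text, char, repl):
    """Replace each `char` that has no identical neighbour on either side.

    Hypothetical helper for illustration; `char` is assumed to be one character.
    """
    c = re.escape(char)  # makes characters such as $ or . literal in the pattern
    pattern = f'(?<!{c}){c}(?!{c})'
    # A callable replacement inserts `repl` verbatim, with no backslash processing.
    return re.sub(pattern, lambda m: repl, text)

print(replace_single('$a $$b $$$c $d', '$', 'z'))  # za $$b $$$c zd
print(replace_single('a.b..c', '.', '-'))          # a-b..c
```

Escaping matters here because an unescaped $ in a pattern is an end-of-string anchor rather than a literal dollar sign.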

Answer #3:

OK, without lookaround and without a callback function:

re.sub(r'(^|[^$])\$([^$]|$)', r'\1z\2', s)
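
One caveat worth checking before relying on this pattern: it consumes the character on either side of the $, so two single $ signs separated by exactly one character cannot both match. A small sketch comparing it with the lookaround version (the variable names and the test string '$a$b' are illustrative, not from the original answer):

```python
import re

consuming = r'(^|[^$])\$([^$]|$)'   # matches the neighbours as part of the match
lookaround = r'(?<!\$)\$(?!\$)'     # only inspects the neighbours

s2 = '$a$b'

# The first match swallows 'a', so the second $ has no preceding character left to match.
consuming_result = re.sub(consuming, r'\1z\2', s2)
lookaround_result = re.sub(lookaround, 'z', s2)

print(consuming_result)   # za$b
print(lookaround_result)  # zazb
```

For the string in the question both patterns give the same result, since every single $ there is separated from the next by more than one character.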

Answer #4:

An alternative with re.split:

''.join('z' if x == '$' else x for x in re.split(r'(\$+)', s))
Answered By: perreal
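
This works because re.split keeps the delimiters in the result when the pattern contains a capturing group, so each run of $ shows up as its own list item. A short sketch of the intermediate list (the names `parts` and `result` are illustrative):

```python
import re

s = '$a $$b $$$c $d'

# The capturing group makes re.split return the $ runs alongside the text between them.
parts = re.split(r'(\$+)', s)
print(parts)  # ['', '$', 'a ', '$$', 'b ', '$$$', 'c ', '$', 'd']

# Only the items that are exactly one $ get replaced; longer runs pass through unchanged.
result = ''.join('z' if x == '$' else x for x in parts)
print(result)  # za $$b $$$c zd
```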
