How to replace multiple substrings of a string?

Problem :

I would like to use the .replace function to replace multiple strings.

I currently have

string.replace("condition1", "")

but would like to have something like

string.replace("condition1", "").replace("condition2", "text")

although that does not feel like good syntax.

What is the proper way to do this? Something like how, in grep/regex, you can use \1 and \2 to replace fields matched by certain search strings.

Solution :

Here is a short example that should do the trick with regular expressions:

import re

rep = {"condition1": "", "condition2": "text"} # define desired replacements here

# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.items())
# On Python 2, use rep.iteritems() instead of rep.items()
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)

For example:

>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'
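
To see why the keys are passed through re.escape, here is a minimal sketch with keys that contain regex metacharacters. The keys and sample text are made up for illustration, and the lookup uses the original dictionary directly rather than re-keying it as above.

```python
import re

# Hypothetical replacements whose keys contain regex metacharacters.
rep = {"a.b": "X", "(c)": "Y"}
text = "a.b axb (c) c"

# Escaped keys are matched literally.
escaped = re.compile("|".join(re.escape(k) for k in rep))
escaped_result = escaped.sub(lambda m: rep[m.group(0)], text)

# Unescaped, "." matches any character, so "axb" matches too
# and the dictionary lookup fails.
unescaped = re.compile("|".join(rep))
try:
    unescaped.sub(lambda m: rep[m.group(0)], text)
    unescaped_failed = False
except KeyError:
    unescaped_failed = True

print(escaped_result)    # X axb Y c
print(unescaped_failed)  # True
```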

You could just make a nice little looping function.

def replace_all(text, dic):
    for i, j in dic.items():  # on Python 2, use dic.iteritems()
        text = text.replace(i, j)
    return text

where text is the complete string and dic is a dictionary in which each key is a search term and each value is the string that replaces it.

Note: dict.iteritems() exists only in Python 2; Python 3 uses items().


Careful: Python dictionaries don’t have a reliable order for iteration. This solution only solves your problem if:

  • order of replacements is irrelevant
  • it’s ok for a replacement to change the results of previous replacements

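Update: The caveat about iteration order no longer applies on Python 3.7 and later, where standard dicts are guaranteed to iterate in insertion order (CPython 3.6 already behaves this way as an implementation detail). The second caveat still holds: a later replacement can rewrite the result of an earlier one.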

For instance:

d = { "cat": "dog", "dog": "pig"}
my_sentence = "This is my cat and this is my dog."
my_sentence = replace_all(my_sentence, d)  # strings are immutable, so keep the returned value
print(my_sentence)

Possible output #1:

"This is my pig and this is my pig."

Possible output #2:

"This is my dog and this is my pig."

One possible fix is to use an OrderedDict.

from collections import OrderedDict
def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text
od = OrderedDict([("cat", "dog"), ("dog", "pig")])
my_sentence = "This is my cat and this is my dog."
my_sentence = replace_all(my_sentence, od)  # strings are immutable, so keep the returned value
print(my_sentence)

Output:

"This is my pig and this is my pig."

Careful #2: This is inefficient if your text string is very large or there are many pairs in the dictionary, since every pair triggers another full pass over the text and builds a new string.

Why not one solution like this?

s = "The quick brown fox jumps over the lazy dog"
for r in (("brown", "red"), ("lazy", "quick")):
    s = s.replace(*r)

#output will be:  The quick red fox jumps over the quick dog
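
Since the pairs are applied one after another, their order is part of the behaviour. Here is a small sketch that reuses the cat/dog sentence from earlier; the helper name replace_pairs is made up for illustration.

```python
def replace_pairs(text, pairs):
    # Apply each (old, new) pair in the given order.
    for old, new in pairs:
        text = text.replace(old, new)
    return text

sentence = "This is my cat and this is my dog."

# "cat" becomes "dog" first, then every "dog" becomes "pig".
cat_first = replace_pairs(sentence, (("cat", "dog"), ("dog", "pig")))

# "dog" becomes "pig" first, so the "dog" created afterwards survives.
dog_first = replace_pairs(sentence, (("dog", "pig"), ("cat", "dog")))

print(cat_first)  # This is my pig and this is my pig.
print(dog_first)  # This is my dog and this is my pig.
```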

Here is a variant of the first solution using reduce, in case you like being functional. 🙂

from functools import reduce  # reduce is a builtin only on Python 2

repls = {'hello' : 'goodbye', 'world' : 'earth'}
s = 'hello, world'
reduce(lambda a, kv: a.replace(*kv), repls.items(), s)

martineau’s even better version:

repls = ('hello', 'goodbye'), ('world', 'earth')
s = 'hello, world'
reduce(lambda a, kv: a.replace(*kv), repls, s)

This is just a more concise recap of the great answers by F.J and MiniQuark, plus the last but decisive improvement by bgusach. All you need to achieve multiple simultaneous string replacements is the following function:

import re

def multiple_replace(string, rep_dict):
    # longest keys first, so a key is never shadowed by a shorter key that is its prefix
    pattern = re.compile("|".join([re.escape(k) for k in sorted(rep_dict,key=len,reverse=True)]), flags=re.DOTALL)
    return pattern.sub(lambda x: rep_dict[x.group(0)], string)

Usage:

>>> multiple_replace("Do you like cafe? No, I prefer tea.", {'cafe':'tea', 'tea':'cafe', 'like':'prefer'})
'Do you prefer tea? No, I prefer cafe.'
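
The sort by length matters when one key is a prefix of another, because regex alternation takes the first alternative that matches at a position, not the longest one. A rough sketch, with made-up keys and a hypothetical helper build_pattern:

```python
import re

def build_pattern(keys, longest_first):
    # Optionally order the alternatives from longest to shortest.
    ordered = sorted(keys, key=len, reverse=True) if longest_first else list(keys)
    return re.compile("|".join(re.escape(k) for k in ordered))

rep = {"ab": "1", "abc": "2"}
keys = ["ab", "abc"]  # shorter key listed first on purpose

# Pattern "ab|abc": "ab" wins at position 0 and "c" is left over.
unsorted_result = build_pattern(keys, False).sub(lambda m: rep[m.group(0)], "abc")

# Pattern "abc|ab": the longer key gets its chance first.
sorted_result = build_pattern(keys, True).sub(lambda m: rep[m.group(0)], "abc")

print(unsorted_result)  # 1c
print(sorted_result)    # 2
```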

If you wish, you can make your own dedicated replacement functions starting from this simpler one.
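
As one example of such a dedicated function, here is a sketch of a case-insensitive variant. The name multiple_replace_ci is made up, and it assumes that lowercasing a match gives back the lowercased key, which holds for ASCII text but may not for every Unicode character.

```python
import re

def multiple_replace_ci(string, rep_dict):
    # Normalise keys to lower case so any casing of a match can be looked up.
    lowered = {k.lower(): v for k, v in rep_dict.items()}
    pattern = re.compile(
        "|".join(re.escape(k) for k in sorted(lowered, key=len, reverse=True)),
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: lowered[m.group(0).lower()], string)

result = multiple_replace_ci("Cafe or TEA", {"cafe": "tea", "tea": "cafe"})
print(result)  # tea or cafe
```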

Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), we can apply the replacements within a list comprehension:

# text = "The quick brown fox jumps over the lazy dog"
# replacements = [("brown", "red"), ("lazy", "quick")]
[text := text.replace(a, b) for a, b in replacements]
# text = 'The quick red fox jumps over the quick dog'
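
Note that the comprehension also builds a list holding the text after each step, which is usually thrown away. A small sketch that keeps it, wrapped in a function whose name (replace_with_steps) is made up for illustration; it needs Python 3.8 or later.

```python
def replace_with_steps(text, replacements):
    # Each element of `steps` is the text after one more replacement;
    # := rebinds `text` in the enclosing function scope.
    steps = [text := text.replace(old, new) for old, new in replacements]
    return text, steps

final, steps = replace_with_steps(
    "The quick brown fox jumps over the lazy dog",
    [("brown", "red"), ("lazy", "quick")],
)

print(steps[0])  # The quick red fox jumps over the lazy dog
print(final)     # The quick red fox jumps over the quick dog
```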

I built this upon F.J.'s excellent answer:

import re

def multiple_replacer(*key_values):
    replace_dict = dict(key_values)
    replacement_function = lambda match: replace_dict[match.group(0)]
    pattern = re.compile("|".join([re.escape(k) for k, v in key_values]), re.M)
    return lambda string: pattern.sub(replacement_function, string)

def multiple_replace(string, *key_values):
    return multiple_replacer(*key_values)(string)

One shot usage:

>>> replacements = (u"café", u"tea"), (u"tea", u"café"), (u"like", u"love")
>>> print(multiple_replace(u"Do you like café? No, I prefer tea.", *replacements))
Do you love tea? No, I prefer café.

Note that since replacement is done in just one pass, “café” changes to “tea”, but it does not change back to “café”.
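
To make the one-pass behaviour concrete, here is a rough comparison with chained str.replace calls on a swap. The helper single_pass is a compact stand-in for the function above, not the exact code.

```python
import re

def single_pass(string, rep_dict):
    # One regex pass: each position in the input is replaced at most once.
    pattern = re.compile("|".join(re.escape(k) for k in sorted(rep_dict, key=len, reverse=True)))
    return pattern.sub(lambda m: rep_dict[m.group(0)], string)

swap = {"café": "tea", "tea": "café"}
text = "Do you like café? No, I prefer tea."

one_pass = single_pass(text, swap)

# Chained replace: the second step also rewrites the output of the first.
chained = text
for old, new in swap.items():
    chained = chained.replace(old, new)

print(one_pass)  # Do you like tea? No, I prefer café.
print(chained)   # Do you like café? No, I prefer café.
```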

If you need to do the same replacement many times, you can create a replacement function easily:

>>> my_escaper = multiple_replacer(('"', '\\"'), ('\t', '\\t'))
>>> many_many_strings = (u'This text will be escaped by "my_escaper"',
...                      u'Does this work?\tYes it does',
...                      u'And can we span\nmultiple lines?\t"Yes\twe\tcan!"')
>>> for line in many_many_strings:
...     print(my_escaper(line))
...
This text will be escaped by \"my_escaper\"
Does this work?\tYes it does
And can we span
multiple lines?\t\"Yes\twe\tcan!\"
