How do I append one string to another in Python?

Posted on

Question :

How do I append one string to another in Python?

I want an efficient way to append one string to another in Python, other than the following.

var1 = "foo"
var2 = "bar"
var3 = var1 + var2

Is there any good built-in method to use?

Answer #1:

If you only have one reference to a string and you concatenate another string to the end, CPython now special cases this and tries to extend the string in place.

The end result is that the operation is amortized O(n).

e.g.

s = ""
for i in range(n):
    s+=str(i)

used to be O(n^2), but now it is O(n).

From the source (bytesobject.c):

It’s easy enough to verify empirically.

$ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
1000000 loops, best of 3: 1.85 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
10000 loops, best of 3: 16.8 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
10000 loops, best of 3: 158 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
1000 loops, best of 3: 1.71 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 14.6 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
10 loops, best of 3: 173 msec per loop

It’s important however to note that this optimisation isn’t part of the Python spec. It’s only in the cPython implementation as far as I know. The same empirical testing on pypy or jython for example might show the older O(n**2) performance .

$ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
10000 loops, best of 3: 90.8 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
1000 loops, best of 3: 896 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
100 loops, best of 3: 9.03 msec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
10 loops, best of 3: 89.5 msec per loop

So far so good, but then,

$ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 12.8 sec per loop

ouch even worse than quadratic. So pypy is doing something that works well with short strings, but performs poorly for larger strings.

Answered By: John La Rooy

Answer #2:

Don’t prematurely optimize. If you have no reason to believe there’s a speed bottleneck caused by string concatenations then just stick with + and +=:

s  = 'foo'
s += 'bar'
s += 'baz'

That said, if you’re aiming for something like Java’s StringBuilder, the canonical Python idiom is to add items to a list and then use str.join to concatenate them all at the end:

l = []
l.append('foo')
l.append('bar')
l.append('baz')

s = ''.join(l)
Answered By: John Kugelman

Answer #3:

str1 = "Hello"
str2 = "World"
newstr = " ".join((str1, str2))

That joins str1 and str2 with a space as separators. You can also do "".join(str1, str2, ...). str.join() takes an iterable, so you’d have to put the strings in a list or a tuple.

That’s about as efficient as it gets for a builtin method.

Answered By: Rafe Kettler

Answer #4:

Don’t.

That is, for most cases you are better off generating the whole string in one go rather then appending to an existing string.

For example, don’t do: obj1.name + ":" + str(obj1.count)

Instead: use "%s:%d" % (obj1.name, obj1.count)

That will be easier to read and more efficient.

Answered By: Winston Ewert

Answer #5:

Python 3.6 gives us f-strings, which are a delight:

var1 = "foo"
var2 = "bar"
var3 = f"{var1}{var2}"
print(var3)                       # prints foobar

You can do most anything inside the curly braces

print(f"1 + 1 == {1 + 1}")        # prints 1 + 1 == 2
Answered By: Trenton

Answer #6:

If you need to do many append operations to build a large string, you can use StringIO or cStringIO. The interface is like a file. ie: you write to append text to it.

If you’re just appending two strings then just use +.

Answered By: Laurence Gonsalves

Answer #7:

it really depends on your application. If you’re looping through hundreds of words and want to append them all into a list, .join() is better. But if you’re putting together a long sentence, you’re better off using +=.

Answered By: Ramy

Answer #8:

Basically, no difference. The only consistent trend is that Python seems to be getting slower with every version… 🙁


List

%%timeit
x = []
for i in range(100000000):  # xrange on Python 2.7
    x.append('a')
x = ''.join(x)

Python 2.7

1 loop, best of 3: 7.34 s per loop

Python 3.4

1 loop, best of 3: 7.99 s per loop

Python 3.5

1 loop, best of 3: 8.48 s per loop

Python 3.6

1 loop, best of 3: 9.93 s per loop


String

%%timeit
x = ''
for i in range(100000000):  # xrange on Python 2.7
    x += 'a'

Python 2.7:

1 loop, best of 3: 7.41 s per loop

Python 3.4

1 loop, best of 3: 9.08 s per loop

Python 3.5

1 loop, best of 3: 8.82 s per loop

Python 3.6

1 loop, best of 3: 9.24 s per loop

Answered By: ostrokach

Leave a Reply

Your email address will not be published. Required fields are marked *