How to find the cumulative sum of numbers in a list?

Posted on

Solving problem is about exposing yourself to as many situations as possible like How to find the cumulative sum of numbers in a list? and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about How to find the cumulative sum of numbers in a list?, which can be followed any time. Take easy to follow this discuss.

How to find the cumulative sum of numbers in a list?
time_interval = [4, 6, 12]

I want to sum up the numbers like [4, 4+6, 4+6+12] in order to get the list t = [4, 10, 22].

I tried the following:

t1 = time_interval[0]
t2 = time_interval[1] + t1
t3 = time_interval[2] + t2
print(t1, t2, t3)  # -> 4 10 22

Answer #1:

If you’re doing much numerical work with arrays like this, I’d suggest numpy, which comes with a cumulative sum function cumsum:

import numpy as np
a = [4,6,12]
np.cumsum(a)
#array([4, 10, 22])

Numpy is often faster than pure python for this kind of thing, see in comparison to @Ashwini’s accumu:

In [136]: timeit list(accumu(range(1000)))
10000 loops, best of 3: 161 us per loop
In [137]: timeit list(accumu(xrange(1000)))
10000 loops, best of 3: 147 us per loop
In [138]: timeit np.cumsum(np.arange(1000))
100000 loops, best of 3: 10.1 us per loop

But of course if it’s the only place you’ll use numpy, it might not be worth having a dependence on it.

Answered By: askewchan

Answer #2:

In Python 2 you can define your own generator function like this:

def accumu(lis):
    total = 0
    for x in lis:
        total += x
        yield total
In [4]: list(accumu([4,6,12]))
Out[4]: [4, 10, 22]

And in Python 3.2+ you can use itertools.accumulate():

In [1]: lis = [4,6,12]
In [2]: from itertools import accumulate
In [3]: list(accumulate(lis))
Out[3]: [4, 10, 22]
Answered By: Ashwini Chaudhary

Answer #3:

I did a bench-mark of the top two answers with Python 3.4 and I found itertools.accumulate is faster than numpy.cumsum under many circumstances, often much faster. However, as you can see from the comments, this may not always be the case, and it’s difficult to exhaustively explore all options. (Feel free to add a comment or edit this post if you have further benchmark results of interest.)

Some timings…

For short lists accumulate is about 4 times faster:

from timeit import timeit
def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))
def sum2(l):
    from numpy import cumsum
    return list(cumsum(l))
l = [1, 2, 3, 4, 5]
timeit(lambda: sum1(l), number=100000)
# 0.4243644131347537
timeit(lambda: sum2(l), number=100000)
# 1.7077815784141421

For longer lists accumulate is about 3 times faster:

l = [1, 2, 3, 4, 5]*1000
timeit(lambda: sum1(l), number=100000)
# 19.174508565105498
timeit(lambda: sum2(l), number=100000)
# 61.871223849244416

If the numpy array is not cast to list, accumulate is still about 2 times faster:

from timeit import timeit
def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))
def sum2(l):
    from numpy import cumsum
    return cumsum(l)
l = [1, 2, 3, 4, 5]*1000
print(timeit(lambda: sum1(l), number=100000))
# 19.18597290944308
print(timeit(lambda: sum2(l), number=100000))
# 37.759664884768426

If you put the imports outside of the two functions and still return a numpy array, accumulate is still nearly 2 times faster:

from timeit import timeit
from itertools import accumulate
from numpy import cumsum
def sum1(l):
    return list(accumulate(l))
def sum2(l):
    return cumsum(l)
l = [1, 2, 3, 4, 5]*1000
timeit(lambda: sum1(l), number=100000)
# 19.042188624851406
timeit(lambda: sum2(l), number=100000)
# 35.17324400227517
Answered By: Chris_Rands

Answer #4:

Behold:

a = [4, 6, 12]
reduce(lambda c, x: c + [c[-1] + x], a, [0])[1:]

Will output (as expected):

[4, 10, 22]
Answered By: wonder.mice

Answer #5:

Try this:
accumulate function, along with operator add performs the running addition.

import itertools
import operator
result = itertools.accumulate([1,2,3,4,5], operator.add)
list(result)
Answered By: Vani S

Answer #6:

Assignment expressions from PEP 572 (new in Python 3.8) offer yet another way to solve this:

time_interval = [4, 6, 12]
total_time = 0
cum_time = [total_time := total_time + t for t in time_interval]
Answered By: Steven Rumbalski

Answer #7:

You can calculate the cumulative sum list in linear time with a simple for loop:

def csum(lst):
    s = lst.copy()
    for i in range(1, len(s)):
        s[i] += s[i-1]
    return s
time_interval = [4, 6, 12]
print(csum(time_interval))  # [4, 10, 22]

The standard library’s itertools.accumulate may be a faster alternative (since it’s implemented in C):

from itertools import accumulate
time_interval = [4, 6, 12]
print(list(accumulate(time_interval)))  # [4, 10, 22]
Answered By: Eugene Yarmash

Answer #8:

If You want a pythonic way without numpy working in 2.7 this would be my way of doing it

l = [1,2,3,4]
_d={-1:0}
cumsum=[_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]

now let’s try it and test it against all other implementations

import timeit, sys
L=list(range(10000))
if sys.version_info >= (3, 0):
    reduce = functools.reduce
    xrange = range
def sum1(l):
    cumsum=[]
    total = 0
    for v in l:
        total += v
        cumsum.append(total)
    return cumsum
def sum2(l):
    import numpy as np
    return list(np.cumsum(l))
def sum3(l):
    return [sum(l[:i+1]) for i in xrange(len(l))]
def sum4(l):
    return reduce(lambda c, x: c + [c[-1] + x], l, [0])[1:]
def this_implementation(l):
    _d={-1:0}
    return [_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]
# sanity check
sum1(L)==sum2(L)==sum3(L)==sum4(L)==this_implementation(L)
>>> True
# PERFORMANCE TEST
timeit.timeit('sum1(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.001018061637878418
timeit.timeit('sum2(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.000829620361328125
timeit.timeit('sum3(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.4606760001182556
timeit.timeit('sum4(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.18932826995849608
timeit.timeit('this_implementation(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.002348129749298096
Answered By: Cobry

Leave a Reply

Your email address will not be published. Required fields are marked *