I’m trying to identify if a large list has consecutive elements that are the same.
So let’s say:
lst = [1, 2, 3, 4, 5, 5, 6]
And in this case, I would return true, since two consecutive elements of
lst have the same value.
I know this could probably be done with some sort of combination of loops, but I was wondering if there were a more efficient way to do this?
You can use itertools.groupby() and a generator expression within any():
from itertools import groupby
any(sum(1 for _ in g) > 1 for _, g in groupby(lst))  # True
Or, more Pythonically, you can use zip() to check whether at least two consecutive items in your list are equal:
any(i == j for i, j in zip(lst, lst[1:]))  # True
In Python 2.x, use itertools.izip() instead of zip() so that you get an iterator rather than a list of all pairs.
Note: The first approach is good when you want to check if there are more than 2 consecutive equal items, otherwise, in this case the second one takes the cake!
Using sum(1 for _ in g) instead of len(list(g)) saves memory, because it never builds the whole group as a list, but the latter is slightly faster.
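As the note says, the groupby() approach extends naturally to runs longer than two. A minimal sketch, assuming a helper named has_run (a name made up here for illustration) that takes the required run length as a parameter:

```python
from itertools import groupby


def has_run(items, n=2):
    """Return True if some value occurs at least n times in a row."""
    # groupby() yields one group per run of equal adjacent values;
    # sum(1 for _ in group) counts the run length without building a list.
    return any(sum(1 for _ in group) >= n for _, group in groupby(items))


print(has_run([1, 2, 3, 4, 5, 5, 6]))       # True
print(has_run([1, 2, 3, 4, 5, 5, 6], n=3))  # False
```

Because any() short-circuits, this stops at the first run that is long enough.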
You can use a simple any() check:
lst = [1, 2, 3, 4, 5, 5, 6]
any(lst[i] == lst[i+1] for i in range(len(lst)-1))  # outputs: True
any() returns True if any of the iterable's elements are true.
If you’re looking for an efficient way of doing this and the lists are numerical, you would probably want to use
numpy and apply the
diff (difference) function:
>>> numpy.diff([1,2,3,4,5,5,6])
array([1, 1, 1, 1, 0, 1])
Then, to get a single result indicating whether any consecutive elements are equal:
This first performs the
diff, inverts the answer, and then checks if
any of the resulting elements are non-zero.
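One way to write the diff, invert, any sequence described here is sketched below. It assumes numpy is imported as np, and the function name has_consecutive_equal is made up for illustration; it is not necessarily the exact expression the answer had in mind.

```python
import numpy as np


def has_consecutive_equal(values):
    """Return True if two neighbouring elements of values are equal."""
    diffs = np.diff(values)            # zero wherever two neighbours are equal
    nonzero = diffs.astype(bool)       # True where the difference is non-zero
    return bool(np.any(~nonzero))      # invert, then ask whether any entry is True


print(has_consecutive_equal([1, 2, 3, 4, 5, 5, 6]))  # True
print(has_consecutive_equal([1, 2, 3, 4, 5, 6]))     # False
```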
0 in numpy.diff([1, 2, 3, 4, 5, 5, 6])
also works well and is similar in speed to the
np.any approach (credit for this last version to heracho).
Here is a more general approach:
import numpy as np
number = 7
n_consecutive = 3
arr = np.array([3, 3, 6, 5, 8, 7, 7, 7, 4, 5])  # the run of three 7s starts at index 5
np.any(np.convolve(arr == number, v=np.ones(n_consecutive), mode='valid') == n_consecutive)
This method always searches the whole array, while the approach from @Kasramvd stops as soon as the condition is first met. So which method is faster depends on how sparse the runs of consecutive numbers are.
If you are interested in the positions of the consecutive numbers and have to look at all elements of the array anyway, this approach should be faster, especially for larger arrays and/or longer sequences.
idx = np.nonzero(np.convolve(arr == number, v=np.ones(n_consecutive), mode='valid') == n_consecutive)
# idx holds every i for which all(arr[i:i+n_consecutive] == number)
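To make the position lookup concrete, here is a small sketch wrapped in a function. The name run_starts is hypothetical, and the sketch uses integer arrays throughout (an assumption made here so that the comparison with n_consecutive is exact).

```python
import numpy as np


def run_starts(arr, number, n_consecutive):
    """Return the start indices i where arr[i:i+n_consecutive] all equal number."""
    arr = np.asarray(arr)
    matches = (arr == number).astype(int)        # 1 where the element matches
    window = np.ones(n_consecutive, dtype=int)   # sliding window of ones
    # Each output entry is the number of matches inside one window.
    counts = np.convolve(matches, window, mode="valid")
    return np.nonzero(counts == n_consecutive)[0]


print(run_starts([3, 3, 6, 5, 8, 7, 7, 7, 4, 5], number=7, n_consecutive=3))  # [5]
```

A longer run produces overlapping hits: four 7s in a row with n_consecutive = 3 give two start indices.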
If you are not interested in a specific value but in runs of equal values in general, a slight variation of @jmetz’s answer works:
np.any(np.convolve(np.diff(arr) == 0, v=np.ones(n_consecutive-1), mode='valid') == n_consecutive-1)  # test diff == 0 first: raw differences such as +1, -1 would also sum to 0
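One caveat with sliding windows over np.diff: a window of raw differences can sum to zero without every difference being zero (for example 1, 2, 1 has differences +1 and -1). The sketch below, using a hypothetical helper name has_equal_run, therefore counts the positions where the difference is exactly zero. It assumes the input has at least two elements and n_consecutive >= 2.

```python
import numpy as np


def has_equal_run(arr, n_consecutive):
    """Return True if any value occurs n_consecutive times in a row."""
    # 1 where an element equals its successor, 0 otherwise.
    same_as_next = (np.diff(arr) == 0).astype(int)
    # A run of n equal values means n - 1 consecutive zero differences.
    window = np.ones(n_consecutive - 1, dtype=int)
    counts = np.convolve(same_as_next, window, mode="valid")
    return bool(np.any(counts == n_consecutive - 1))


print(has_equal_run([3, 3, 6, 5, 8, 7, 7, 7, 4, 5], 3))  # True
print(has_equal_run([1, 2, 1, 2], 3))                    # False
```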
A simple for loop should do it:
def check(lst):
    last = lst[0]  # start with the first element (assumes a non-empty list)
    for num in lst[1:]:
        if num == last:
            return True
        last = num
    return False

lst = [1, 2, 3, 4, 5, 5, 6]
print(check(lst))  # Prints True
Here, in each loop, I check if the current element is equal to the previous element.
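For a large list it may be worth avoiding the copy made by lst[1:]. The following is a sketch of the same previous-versus-current comparison written against an iterator; the name has_adjacent_equal is made up here, and the empty-input behaviour (returning False) is a design choice of this sketch.

```python
def has_adjacent_equal(iterable):
    """Return True if two neighbouring items of iterable are equal."""
    it = iter(iterable)
    try:
        last = next(it)          # first element, if there is one
    except StopIteration:
        return False             # empty input: no neighbours to compare
    for item in it:
        if item == last:         # current element equals the previous one
            return True
        last = item
    return False


print(has_adjacent_equal([1, 2, 3, 4, 5, 5, 6]))  # True
print(has_adjacent_equal(iter([1, 2, 3])))        # False
```

Since it only calls iter() and next(), it also works on generators and other one-shot iterables.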
from itertools import pairwise  # available since Python 3.10
any(x == y for (x, y) in pairwise([1, 2, 3, 4, 5, 5, 6]))  # True
The intermediate result of pairwise, shown as a list:
list(pairwise([1, 2, 3, 4, 5, 5, 6]))  # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 5), (5, 6)]
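itertools.pairwise was added in Python 3.10. On older versions, a stand-in can be sketched with itertools.tee; the name pairwise_compat is made up here to avoid shadowing the standard function.

```python
from itertools import tee


def pairwise_compat(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)   # two independent iterators over the same data
    next(b, None)          # advance the second one by one step
    return zip(a, b)


print(list(pairwise_compat([1, 2, 3])))                                # [(1, 2), (2, 3)]
print(any(x == y for x, y in pairwise_compat([1, 2, 3, 4, 5, 5, 6])))  # True
```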
Here is my solution if you want to find out whether 3 consecutive values are equal to 7. For example, with the tuple intList = (7, 7, 7, 8, 9, 1):
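def has_three_sevens(intList):  # wrapper name added here so that return is valid
    for i in range(len(intList) - 2):  # stop early enough that i + 2 stays in range
        if intList[i] == 7 and intList[i + 1] == 7 and intList[i + 2] == 7:
            return True
    return False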
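The same idea can be written without hard-coding the value or the run length. A minimal sketch, assuming a sliceable sequence (list or tuple) and a hypothetical helper name has_value_run:

```python
def has_value_run(values, target, n):
    """Return True if target occurs n times in a row in values."""
    return any(
        all(v == target for v in values[i:i + n])  # every item in the window matches
        for i in range(len(values) - n + 1)        # only windows that fit entirely
    )


print(has_value_run((7, 7, 7, 8, 9, 1), target=7, n=3))  # True
print(has_value_run((7, 7, 8, 7), target=7, n=3))        # False
```

If n is larger than the sequence, the range is empty and the result is False.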