Solving problem is about exposing yourself to as many situations as possible like Understanding slice notation and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about Understanding slice notation, which can be followed any time. Take easy to follow this discuss.
I need a good explanation (references are a plus) on Python’s slice notation.
To me, this notation needs a bit of picking up.
It looks extremely powerful, but I haven’t quite got my head around it.
Answer #1:
It’s pretty simple really:
a[start:stop] # items start through stop-1
a[start:] # items start through the rest of the array
a[:stop] # items from the beginning through stop-1
a[:] # a copy of the whole array
There is also the step
value, which can be used with any of the above:
a[start:stop:step] # start through not past stop, by step
The key point to remember is that the :stop
value represents the first value that is not in the selected slice. So, the difference between stop
and start
is the number of elements selected (if step
is 1, the default).
The other feature is that start
or stop
may be a negative number, which means it counts from the end of the array instead of the beginning. So:
a[-1] # last item in the array
a[-2:] # last two items in the array
a[:-2] # everything except the last two items
Similarly, step
may be a negative number:
a[::-1] # all items in the array, reversed
a[1::-1] # the first two items, reversed
a[:-3:-1] # the last two items, reversed
a[-3::-1] # everything except the last two items, reversed
Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2]
and a
only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
Relation to slice()
object
The slicing operator []
is actually being used in the above code with a slice()
object using the :
notation (which is only valid within []
), i.e.:
a[start:stop:step]
is equivalent to:
a[slice(start, stop, step)]
Slice objects also behave slightly differently depending on the number of arguments, similarly to range()
, i.e. both slice(stop)
and slice(start, stop[, step])
are supported.
To skip specifying a given argument, one might use None
, so that e.g. a[start:]
is equivalent to a[slice(start, None)]
or a[::-1]
is equivalent to a[slice(None, None, -1)]
.
While the :
-based notation is very helpful for simple slicing, the explicit use of slice()
objects simplifies the programmatic generation of slicing.
Answer #2:
The Python tutorial talks about it (scroll down a bit until you get to the part about slicing).
The ASCII art diagram is helpful too for remembering how slices work:
+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
0 1 2 3 4 5 6
-6 -5 -4 -3 -2 -1
One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n.
Answer #3:
Enumerating the possibilities allowed by the grammar:
>>> seq[:] # [seq[0], seq[1], ..., seq[-1] ]
>>> seq[low:] # [seq[low], seq[low+1], ..., seq[-1] ]
>>> seq[:high] # [seq[0], seq[1], ..., seq[high-1]]
>>> seq[low:high] # [seq[low], seq[low+1], ..., seq[high-1]]
>>> seq[::stride] # [seq[0], seq[stride], ..., seq[-1] ]
>>> seq[low::stride] # [seq[low], seq[low+stride], ..., seq[-1] ]
>>> seq[:high:stride] # [seq[0], seq[stride], ..., seq[high-1]]
>>> seq[low:high:stride] # [seq[low], seq[low+stride], ..., seq[high-1]]
Of course, if (high-low)%stride != 0
, then the end point will be a little lower than high-1
.
If stride
is negative, the ordering is changed a bit since we’re counting down:
>>> seq[::-stride] # [seq[-1], seq[-1-stride], ..., seq[0] ]
>>> seq[high::-stride] # [seq[high], seq[high-stride], ..., seq[0] ]
>>> seq[:low:-stride] # [seq[-1], seq[-1-stride], ..., seq[low+1]]
>>> seq[high:low:-stride] # [seq[high], seq[high-stride], ..., seq[low+1]]
Extended slicing (with commas and ellipses) are mostly used only by special data structures (like NumPy); the basic sequences don’t support them.
>>> class slicee:
... def __getitem__(self, item):
... return repr(item)
...
>>> slicee()[0, 1:2, ::5, ...]
'(0, slice(1, 2, None), slice(None, None, 5), Ellipsis)'
Answer #4:
The answers above don’t discuss slice assignment. To understand slice assignment, it’s helpful to add another concept to the ASCII art:
+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
Slice position: 0 1 2 3 4 5 6
Index position: 0 1 2 3 4 5
>>> p = ['P','y','t','h','o','n']
# Why the two sets of numbers:
# indexing gives items, not lists
>>> p[0]
'P'
>>> p[5]
'n'
# Slicing gives lists
>>> p[0:1]
['P']
>>> p[0:2]
['P','y']
One heuristic is, for a slice from zero to n, think: “zero is the beginning, start at the beginning and take n items in a list”.
>>> p[5] # the last of six items, indexed from zero
'n'
>>> p[0:5] # does NOT include the last item!
['P','y','t','h','o']
>>> p[0:6] # not p[0:5]!!!
['P','y','t','h','o','n']
Another heuristic is, “for any slice, replace the start by zero, apply the previous heuristic to get the end of the list, then count the first number back up to chop items off the beginning”
>>> p[0:4] # Start at the beginning and count out 4 items
['P','y','t','h']
>>> p[1:4] # Take one item off the front
['y','t','h']
>>> p[2:4] # Take two items off the front
['t','h']
# etc.
The first rule of slice assignment is that since slicing returns a list, slice assignment requires a list (or other iterable):
>>> p[2:3]
['t']
>>> p[2:3] = ['T']
>>> p
['P','y','T','h','o','n']
>>> p[2:3] = 't'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only assign an iterable
The second rule of slice assignment, which you can also see above, is that whatever portion of the list is returned by slice indexing, that’s the same portion that is changed by slice assignment:
>>> p[2:4]
['T','h']
>>> p[2:4] = ['t','r']
>>> p
['P','y','t','r','o','n']
The third rule of slice assignment is, the assigned list (iterable) doesn’t have to have the same length; the indexed slice is simply sliced out and replaced en masse by whatever is being assigned:
>>> p = ['P','y','t','h','o','n'] # Start over
>>> p[2:4] = ['s','p','a','m']
>>> p
['P','y','s','p','a','m','o','n']
The trickiest part to get used to is assignment to empty slices. Using heuristic 1 and 2 it’s easy to get your head around indexing an empty slice:
>>> p = ['P','y','t','h','o','n']
>>> p[0:4]
['P','y','t','h']
>>> p[1:4]
['y','t','h']
>>> p[2:4]
['t','h']
>>> p[3:4]
['h']
>>> p[4:4]
[]
And then once you’ve seen that, slice assignment to the empty slice makes sense too:
>>> p = ['P','y','t','h','o','n']
>>> p[2:4] = ['x','y'] # Assigned list is same length as slice
>>> p
['P','y','x','y','o','n'] # Result is same length
>>> p = ['P','y','t','h','o','n']
>>> p[3:4] = ['x','y'] # Assigned list is longer than slice
>>> p
['P','y','t','x','y','o','n'] # The result is longer
>>> p = ['P','y','t','h','o','n']
>>> p[4:4] = ['x','y']
>>> p
['P','y','t','h','x','y','o','n'] # The result is longer still
Note that, since we are not changing the second number of the slice (4), the inserted items always stack right up against the ‘o’, even when we’re assigning to the empty slice. So the position for the empty slice assignment is the logical extension of the positions for the non-empty slice assignments.
Backing up a little bit, what happens when you keep going with our procession of counting up the slice beginning?
>>> p = ['P','y','t','h','o','n']
>>> p[0:4]
['P','y','t','h']
>>> p[1:4]
['y','t','h']
>>> p[2:4]
['t','h']
>>> p[3:4]
['h']
>>> p[4:4]
[]
>>> p[5:4]
[]
>>> p[6:4]
[]
With slicing, once you’re done, you’re done; it doesn’t start slicing backwards. In Python you don’t get negative strides unless you explicitly ask for them by using a negative number.
>>> p[5:3:-1]
['n','o']
There are some weird consequences to the “once you’re done, you’re done” rule:
>>> p[4:4]
[]
>>> p[5:4]
[]
>>> p[6:4]
[]
>>> p[6]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
In fact, compared to indexing, Python slicing is bizarrely error-proof:
>>> p[100:200]
[]
>>> p[int(2e99):int(1e99)]
[]
This can come in handy sometimes, but it can also lead to somewhat strange behavior:
>>> p
['P', 'y', 't', 'h', 'o', 'n']
>>> p[int(2e99):int(1e99)] = ['p','o','w','e','r']
>>> p
['P', 'y', 't', 'h', 'o', 'n', 'p', 'o', 'w', 'e', 'r']
Depending on your application, that might… or might not… be what you were hoping for there!
Below is the text of my original answer. It has been useful to many people, so I didn’t want to delete it.
>>> r=[1,2,3,4]
>>> r[1:1]
[]
>>> r[1:1]=[9,8]
>>> r
[1, 9, 8, 2, 3, 4]
>>> r[1:1]=['blah']
>>> r
[1, 'blah', 9, 8, 2, 3, 4]
This may also clarify the difference between slicing and indexing.
Answer #5:
Explain Python’s slice notation
In short, the colons (:
) in subscript notation (subscriptable[subscriptarg]
) make slice notation – which has the optional arguments, start
, stop
, step
:
sliceable[start:stop:step]
Python slicing is a computationally fast way to methodically access parts of your data. In my opinion, to be even an intermediate Python programmer, it’s one aspect of the language that it is necessary to be familiar with.
Important Definitions
To begin with, let’s define a few terms:
start: the beginning index of the slice, it will include the element at this index unless it is the same as stop, defaults to 0, i.e. the first index. If it’s negative, it means to start
n
items from the end.stop: the ending index of the slice, it does not include the element at this index, defaults to length of the sequence being sliced, that is, up to and including the end.
step: the amount by which the index increases, defaults to 1. If it’s negative, you’re slicing over the iterable in reverse.
How Indexing Works
You can make any of these positive or negative numbers. The meaning of the positive numbers is straightforward, but for negative numbers, just like indexes in Python, you count backwards from the end for the start and stop, and for the step, you simply decrement your index. This example is from the documentation’s tutorial, but I’ve modified it slightly to indicate which item in a sequence each index references:
+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
0 1 2 3 4 5
-6 -5 -4 -3 -2 -1
How Slicing Works
To use slice notation with a sequence that supports it, you must include at least one colon in the square brackets that follow the sequence (which actually implement the __getitem__
method of the sequence, according to the Python data model.)
Slice notation works like this:
sequence[start:stop:step]
And recall that there are defaults for start, stop, and step, so to access the defaults, simply leave out the argument.
Slice notation to get the last nine elements from a list (or any other sequence that supports it, like a string) would look like this:
my_list[-9:]
When I see this, I read the part in the brackets as “9th from the end, to the end.” (Actually, I abbreviate it mentally as “-9, on”)
Explanation:
The full notation is
my_list[-9:None:None]
and to substitute the defaults (actually when step
is negative, stop
‘s default is -len(my_list) - 1
, so None
for stop really just means it goes to whichever end step takes it to):
my_list[-9:len(my_list):1]
The colon, :
, is what tells Python you’re giving it a slice and not a regular index. That’s why the idiomatic way of making a shallow copy of lists in Python 2 is
list_copy = sequence[:]
And clearing them is with:
del my_list[:]
(Python 3 gets a list.copy
and list.clear
method.)
When step
is negative, the defaults for start
and stop
change
By default, when the step
argument is empty (or None
), it is assigned to +1
.
But you can pass in a negative integer, and the list (or most other standard slicables) will be sliced from the end to the beginning.
Thus a negative slice will change the defaults for start
and stop
!
Confirming this in the source
I like to encourage users to read the source as well as the documentation. The source code for slice objects and this logic is found here. First we determine if step
is negative:
step_is_negative = step_sign < 0;
If so, the lower bound is -1
meaning we slice all the way up to and including the beginning, and the upper bound is the length minus 1, meaning we start at the end. (Note that the semantics of this -1
is different from a -1
that users may pass indexes in Python indicating the last item.)
if (step_is_negative) {
lower = PyLong_FromLong(-1L);
if (lower == NULL)
goto error;
upper = PyNumber_Add(length, lower);
if (upper == NULL)
goto error;
}
Otherwise step
is positive, and the lower bound will be zero and the upper bound (which we go up to but not including) the length of the sliced list.
else {
lower = _PyLong_Zero;
Py_INCREF(lower);
upper = length;
Py_INCREF(upper);
}
Then, we may need to apply the defaults for start
and stop
– the default then for start
is calculated as the upper bound when step
is negative:
if (self->start == Py_None) {
start = step_is_negative ? upper : lower;
Py_INCREF(start);
}
and stop
, the lower bound:
if (self->stop == Py_None) {
stop = step_is_negative ? lower : upper;
Py_INCREF(stop);
}
Give your slices a descriptive name!
You may find it useful to separate forming the slice from passing it to the list.__getitem__
method (that’s what the square brackets do). Even if you’re not new to it, it keeps your code more readable so that others that may have to read your code can more readily understand what you’re doing.
However, you can’t just assign some integers separated by colons to a variable. You need to use the slice object:
last_nine_slice = slice(-9, None)
The second argument, None
, is required, so that the first argument is interpreted as the start
argument otherwise it would be the stop
argument.
You can then pass the slice object to your sequence:
>>> list(range(100))[last_nine_slice]
[91, 92, 93, 94, 95, 96, 97, 98, 99]
It’s interesting that ranges also take slices:
>>> range(100)[last_nine_slice]
range(91, 100)
Memory Considerations:
Since slices of Python lists create new objects in memory, another important function to be aware of is itertools.islice
. Typically you’ll want to iterate over a slice, not just have it created statically in memory. islice
is perfect for this. A caveat, it doesn’t support negative arguments to start
, stop
, or step
, so if that’s an issue you may need to calculate indices or reverse the iterable in advance.
length = 100
last_nine_iter = itertools.islice(list(range(length)), length-9, None, 1)
list_last_nine = list(last_nine_iter)
and now:
>>> list_last_nine
[91, 92, 93, 94, 95, 96, 97, 98, 99]
The fact that list slices make a copy is a feature of lists themselves. If you’re slicing advanced objects like a Pandas DataFrame, it may return a view on the original, and not a copy.
Answer #6:
And a couple of things that weren’t immediately obvious to me when I first saw the slicing syntax:
>>> x = [1,2,3,4,5,6]
>>> x[::-1]
[6,5,4,3,2,1]
Easy way to reverse sequences!
And if you wanted, for some reason, every second item in the reversed sequence:
>>> x = [1,2,3,4,5,6]
>>> x[::-2]
[6,4,2]
Answer #7:
In Python 2.7
Slicing in Python
[a:b:c]
len = length of string, tuple or list
c -- default is +1. The sign of c indicates forward or backward, absolute value of c indicates steps. Default is forward with step size 1. Positive means forward, negative means backward.
a -- When c is positive or blank, default is 0. When c is negative, default is -1.
b -- When c is positive or blank, default is len. When c is negative, default is -(len+1).
Understanding index assignment is very important.
In forward direction, starts at 0 and ends at len-1
In backward direction, starts at -1 and ends at -len
When you say [a:b:c], you are saying depending on the sign of c (forward or backward), start at a and end at b (excluding element at bth index). Use the indexing rule above and remember you will only find elements in this range:
-len, -len+1, -len+2, ..., 0, 1, 2,3,4 , len -1
But this range continues in both directions infinitely:
...,-len -2 ,-len-1,-len, -len+1, -len+2, ..., 0, 1, 2,3,4 , len -1, len, len +1, len+2 , ....
For example:
0 1 2 3 4 5 6 7 8 9 10 11
a s t r i n g
-9 -8 -7 -6 -5 -4 -3 -2 -1
If your choice of a, b, and c allows overlap with the range above as you traverse using rules for a,b,c above you will either get a list with elements (touched during traversal) or you will get an empty list.
One last thing: if a and b are equal, then also you get an empty list:
>>> l1
[2, 3, 4]
>>> l1[:]
[2, 3, 4]
>>> l1[::-1] # a default is -1 , b default is -(len+1)
[4, 3, 2]
>>> l1[:-4:-1] # a default is -1
[4, 3, 2]
>>> l1[:-3:-1] # a default is -1
[4, 3]
>>> l1[::] # c default is +1, so a default is 0, b default is len
[2, 3, 4]
>>> l1[::-1] # c is -1 , so a default is -1 and b default is -(len+1)
[4, 3, 2]
>>> l1[-100:-200:-1] # Interesting
[]
>>> l1[-1:-200:-1] # Interesting
[4, 3, 2]
>>> l1[-1:-1:1]
[]
>>> l1[-1:5:1] # Interesting
[4]
>>> l1[1:-7:1]
[]
>>> l1[1:-7:-1] # Interesting
[3, 2]
>>> l1[:-2:-2] # a default is -1, stop(b) at -2 , step(c) by 2 in reverse direction
[4]
Answer #8:
Found this great table at http://wiki.python.org/moin/MovingToPythonFromOtherLanguages
Python indexes and slices for a six-element list.
Indexes enumerate the elements, slices enumerate the spaces between the elements.
Index from rear: -6 -5 -4 -3 -2 -1 a=[0,1,2,3,4,5] a[1:]==[1,2,3,4,5]
Index from front: 0 1 2 3 4 5 len(a)==6 a[:5]==[0,1,2,3,4]
+---+---+---+---+---+---+ a[0]==0 a[:-2]==[0,1,2,3]
| a | b | c | d | e | f | a[5]==5 a[1:2]==[1]
+---+---+---+---+---+---+ a[-1]==5 a[1:-1]==[1,2,3,4]
Slice from front: : 1 2 3 4 5 : a[-2]==4
Slice from rear: : -5 -4 -3 -2 -1 :
b=a[:]
b==[0,1,2,3,4,5] (shallow copy of a)