Is there a library function that performs binary search on a list/tuple and return the position of the item if found and ‘False’ (-1, None, etc.) if not?
I found the functions bisect_left/right in the bisect module, but they still return a position even if the item is not in the list. That’s perfectly fine for their intended usage, but I just want to know if an item is in the list or not (don’t want to insert anything).
I thought of using
bisect_left and then checking if the item at that position is equal to what I’m searching, but that seems cumbersome (and I also need to do bounds checking if the number can be larger than the largest number in my list). If there is a nicer method I’d like to know about it.
Edit To clarify what I need this for: I’m aware that a dictionary would be very well suited for this, but I’m trying to keep the memory consumption as low as possible. My intended usage would be a sort of double-way look-up table. I have in the table a list of values and I need to be able to access the values based on their index. And also I want to be able to find the index of a particular value or None if the value is not in the list.
Using a dictionary for this would be the fastest way, but would (approximately) double the memory requirements.
I was asking this question thinking that I may have overlooked something in the Python libraries. It seems I’ll have to write my own code, as Moe suggested.
bisect_left finds the first position
p at which an element could be inserted in a given sorted range while maintaining the sorted order. That will be the position of
x exists in the range. If
p is the past-the-end position,
x wasn’t found. Otherwise, we can test to see if
x is there to see if
x was found.
from bisect import bisect_left def binary_search(a, x, lo=0, hi=None): if hi is None: hi = len(a) pos = bisect_left(a, x, lo, hi) # find insertion position return pos if pos != hi and a[pos] == x else -1 # don't walk off the end
Why not look at the code for bisect_left/right and adapt it to suit your purpose.
def binary_search(a, x, lo=0, hi=None): if hi is None: hi = len(a) while lo < hi: mid = (lo+hi)//2 midval = a[mid] if midval < x: lo = mid+1 elif midval > x: hi = mid else: return mid return -1
This is a little off-topic (since Moe’s answer seems complete to the OP’s question), but it might be worth looking at the complexity for your whole procedure from end to end. If you’re storing thing in a sorted lists (which is where a binary search would help), and then just checking for existence, you’re incurring (worst-case, unless specified):
- O( n log n) to initially create the list (if it’s unsorted data. O(n), if it’s sorted )
- O( log n) lookups (this is the binary search part)
- O( n ) insert / delete (might be O(1) or O(log n) average case, depending on your pattern)
Whereas with a
set(), you’re incurring
- O(n) to create
- O(1) lookup
- O(1) insert / delete
The thing a sorted list really gets you are “next”, “previous”, and “ranges” (including inserting or deleting ranges), which are O(1) or O(|range|), given a starting index. If you aren’t using those sorts of operations often, then storing as sets, and sorting for display might be a better deal overall.
set() incurs very little additional overhead in python.
It might be worth mentioning that the bisect docs now provide searching examples:
(Raising ValueError instead of returning -1 or None is more pythonic – list.index() does it, for example. But of course you can adapt the examples to your needs.)
Simplest is to use bisect and check one position back to see if the item is there:
def binary_search(a,x,lo=0,hi=-1): i = bisect(a,x,lo,hi) if i == 0: return -1 elif a[i-1] == x: return i-1 else: return -1
This is right from the manual:
8.5.1. Searching Sorted Lists
The above bisect() functions are useful for finding insertion points but can be tricky or awkward to use for common searching tasks. The following five functions show how to transform them into the standard lookups for sorted lists:
def index(a, x): 'Locate the leftmost value exactly equal to x' i = bisect_left(a, x) if i != len(a) and a[i] == x: return i raise ValueError
So with the slight modification your code should be:
def index(a, x): 'Locate the leftmost value exactly equal to x' i = bisect_left(a, x) if i != len(a) and a[i] == x: return i return -1
I agree that @DaveAbrahams’s answer using the bisect module is the correct approach. He did not mention one important detail in his answer.
From the docs
bisect.bisect_left(a, x, lo=0, hi=len(a))
The bisection module does not require the search array to be precomputed ahead of time. You can just present the endpoints to the
bisect.bisect_left instead of it using the defaults of
Even more important for my use, looking for a value X such that the error of a given function is minimized. To do that, I needed a way to have the bisect_left’s algorithm call my computation instead. This is really simple.
Just provide an object that defines
For example, we could use the bisect algorithm to find a square root with arbitrary precision!
import bisect class sqrt_array(object): def __init__(self, digits): self.precision = float(10**(digits)) def __getitem__(self, key): return (key/self.precision)**2.0 sa = sqrt_array(4) # "search" in the range of 0 to 10 with a "precision" of 0.0001 index = bisect.bisect_left(sa, 7, 0, 10*10**4) print 7**0.5 print index/(10**4.0)
This one is:
- not recursive (which makes it more memory-efficient than most recursive approaches)
- actually working
- fast since it runs without any unnecessary if’s and conditions
- based on a mathematical assertion that the floor of (low + high)/2 is always smaller than high where low is the lower limit and high is the upper limit.
def binsearch(t, key, low = 0, high = len(t) - 1): # bisecting the range while low < high: mid = (low + high)//2 if t[mid] < key: low = mid + 1 else: high = mid # at this point 'low' should point at the place # where the value of 'key' is possibly stored. return low if t[low] == key else -1