Binary search (bisection) in Python

Posted on

Question :

Binary search (bisection) in Python

Is there a library function that performs binary search on a list/tuple and return the position of the item if found and ‘False’ (-1, None, etc.) if not?

I found the functions bisect_left/right in the bisect module, but they still return a position even if the item is not in the list. That’s perfectly fine for their intended usage, but I just want to know if an item is in the list or not (don’t want to insert anything).

I thought of using bisect_left and then checking if the item at that position is equal to what I’m searching, but that seems cumbersome (and I also need to do bounds checking if the number can be larger than the largest number in my list). If there is a nicer method I’d like to know about it.

Edit To clarify what I need this for: I’m aware that a dictionary would be very well suited for this, but I’m trying to keep the memory consumption as low as possible. My intended usage would be a sort of double-way look-up table. I have in the table a list of values and I need to be able to access the values based on their index. And also I want to be able to find the index of a particular value or None if the value is not in the list.

Using a dictionary for this would be the fastest way, but would (approximately) double the memory requirements.

I was asking this question thinking that I may have overlooked something in the Python libraries. It seems I’ll have to write my own code, as Moe suggested.

Asked By: rslite

||

Answer #1:

bisect_left finds the first position p at which an element could be inserted in a given sorted range while maintaining the sorted order. That will be the position of x if x exists in the range. If p is the past-the-end position, x wasn’t found. Otherwise, we can test to see if x is there to see if x was found.

from bisect import bisect_left

def binary_search(a, x, lo=0, hi=None):
    if hi is None: hi = len(a)
    pos = bisect_left(a, x, lo, hi)                  # find insertion position
    return pos if pos != hi and a[pos] == x else -1  # don't walk off the end
Answered By: Dave Abrahams

Answer #2:

Why not look at the code for bisect_left/right and adapt it to suit your purpose.

like this:

def binary_search(a, x, lo=0, hi=None):
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        midval = a[mid]
        if midval < x:
            lo = mid+1
        elif midval > x: 
            hi = mid
        else:
            return mid
    return -1
Answered By: Moe

Answer #3:

This is a little off-topic (since Moe’s answer seems complete to the OP’s question), but it might be worth looking at the complexity for your whole procedure from end to end. If you’re storing thing in a sorted lists (which is where a binary search would help), and then just checking for existence, you’re incurring (worst-case, unless specified):

Sorted Lists

  • O( n log n) to initially create the list (if it’s unsorted data. O(n), if it’s sorted )
  • O( log n) lookups (this is the binary search part)
  • O( n ) insert / delete (might be O(1) or O(log n) average case, depending on your pattern)

Whereas with a set(), you’re incurring

  • O(n) to create
  • O(1) lookup
  • O(1) insert / delete

The thing a sorted list really gets you are “next”, “previous”, and “ranges” (including inserting or deleting ranges), which are O(1) or O(|range|), given a starting index. If you aren’t using those sorts of operations often, then storing as sets, and sorting for display might be a better deal overall. set() incurs very little additional overhead in python.

Answered By: Gregg Lind

Answer #4:

It might be worth mentioning that the bisect docs now provide searching examples:
http://docs.python.org/library/bisect.html#searching-sorted-lists

(Raising ValueError instead of returning -1 or None is more pythonic – list.index() does it, for example. But of course you can adapt the examples to your needs.)

Answered By: Petr Viktorin

Answer #5:

Simplest is to use bisect and check one position back to see if the item is there:

def binary_search(a,x,lo=0,hi=-1):
    i = bisect(a,x,lo,hi)
    if i == 0:
        return -1
    elif a[i-1] == x:
        return i-1
    else:
        return -1
Answered By: Imran

Answer #6:

This is right from the manual:

http://docs.python.org/2/library/bisect.html

8.5.1. Searching Sorted Lists

The above bisect() functions are useful for finding insertion points but can be tricky or awkward to use for common searching tasks. The following five functions show how to transform them into the standard lookups for sorted lists:

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError

So with the slight modification your code should be:

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    return -1
Answered By: arainchi

Answer #7:

I agree that @DaveAbrahams’s answer using the bisect module is the correct approach. He did not mention one important detail in his answer.

From the docs bisect.bisect_left(a, x, lo=0, hi=len(a))

The bisection module does not require the search array to be precomputed ahead of time. You can just present the endpoints to the bisect.bisect_left instead of it using the defaults of 0 and len(a).

Even more important for my use, looking for a value X such that the error of a given function is minimized. To do that, I needed a way to have the bisect_left’s algorithm call my computation instead. This is really simple.

Just provide an object that defines __getitem__ as a

For example, we could use the bisect algorithm to find a square root with arbitrary precision!

import bisect

class sqrt_array(object):
    def __init__(self, digits):
        self.precision = float(10**(digits))
    def __getitem__(self, key):
        return (key/self.precision)**2.0

sa = sqrt_array(4)

# "search" in the range of 0 to 10 with a "precision" of 0.0001
index = bisect.bisect_left(sa, 7, 0, 10*10**4)
print 7**0.5
print index/(10**4.0)
Answered By: paulluap

Answer #8:

This one is:

  • not recursive (which makes it more memory-efficient than most recursive approaches)
  • actually working
  • fast since it runs without any unnecessary if’s and conditions
  • based on a mathematical assertion that the floor of (low + high)/2 is always smaller than high where low is the lower limit and high is the upper limit.

def binsearch(t, key, low = 0, high = len(t) - 1):
    # bisecting the range
    while low < high:
        mid = (low + high)//2
        if t[mid] < key:
            low = mid + 1
        else:
            high = mid
    # at this point 'low' should point at the place
    # where the value of 'key' is possibly stored.
    return low if t[low] == key else -1
Answered By: Mateusz Piotrowski

Leave a Reply

Your email address will not be published. Required fields are marked *