Python – Check If Word Is In A String

Posted on

Question :

Python – Check If Word Is In A String

I’m working with Python v2, and I’m trying to find out if you can tell if a word is in a string.

I have found some information about identifying if the word is in the string – using .find, but is there a way to do an IF statement. I would like to have something like the following:

if string.find(word):
    print 'success'

Thanks for any help.

Asked By: The Woo


Answer #1:

What is wrong with:

if word in mystring: 
   print 'success'
Answered By: fabrizioM

Answer #2:

if 'seek' in 'those who seek shall find':

but keep in mind that this matches a sequence of characters, not necessarily a whole word – for example, 'word' in 'swordsmith' is True. If you only want to match whole words, you ought to use regular expressions:

import re

def findWholeWord(w):
    return re.compile(r'b({0})b'.format(w), flags=re.IGNORECASE).search

findWholeWord('seek')('those who seek shall find')    # -> <match object>
findWholeWord('word')('swordsmith')                   # -> None
Answered By: Hugh Bothwell

Answer #3:

If you want to find out whether a whole word is in a space-separated list of words, simply use:

def contains_word(s, w):
    return (' ' + w + ' ') in (' ' + s + ' ')

contains_word('the quick brown fox', 'brown')  # True
contains_word('the quick brown fox', 'row')    # False

This elegant method is also the fastest. Compared to Hugh Bothwell’s and daSong’s approaches:

>python -m timeit -s "def contains_word(s, w): return (' ' + w + ' ') in (' ' + s + ' ')" "contains_word('the quick brown fox', 'brown')"
1000000 loops, best of 3: 0.351 usec per loop

>python -m timeit -s "import re" -s "def contains_word(s, w): return re.compile(r'b({0})b'.format(w), flags=re.IGNORECASE).search(s)" "contains_word('the quick brown fox', 'brown')"
100000 loops, best of 3: 2.38 usec per loop

>python -m timeit -s "def contains_word(s, w): return s.startswith(w + ' ') or s.endswith(' ' + w) or s.find(' ' + w + ' ') != -1" "contains_word('the quick brown fox', 'brown')"
1000000 loops, best of 3: 1.13 usec per loop

Edit: A slight variant on this idea for Python 3.6+, equally fast:

def contains_word(s, w):
    return f' {w} ' in f' {s} '
Answered By: user200783

Answer #4:

find returns an integer representing the index of where the search item was found. If it isn’t found, it returns -1.

haystack = 'asdf'

haystack.find('a') # result: 0
haystack.find('s') # result: 1
haystack.find('g') # result: -1

if haystack.find(needle) >= 0:
  print 'Needle found.'
  print 'Needle not found.'
Answered By: Matt Howell

Answer #5:

You can split string to the words and check the result list.

if word in string.split():
    print 'success'
Answered By: Corvax

Answer #6:

This small function compares all search words in given text. If all search words are found in text, returns length of search, or False otherwise.

Also supports unicode string search.

def find_words(text, search):
    """Find exact words"""
    dText   = text.split()
    dSearch = search.split()

    found_word = 0

    for text_word in dText:
        for search_word in dSearch:
            if search_word == text_word:
                found_word += 1

    if found_word == len(dSearch):
        return lenSearch
        return False


find_words('çelik güray ankara', 'güray ankara')
Answered By: Guray Celik

Answer #7:

If matching a sequence of characters is not sufficient and you need to match whole words, here is a simple function that gets the job done. It basically appends spaces where necessary and searches for that in the string:

def smart_find(haystack, needle):
    if haystack.startswith(needle+" "):
        return True
    if haystack.endswith(" "+needle):
        return True
    if haystack.find(" "+needle+" ") != -1:
        return True
    return False

This assumes that commas and other punctuations have already been stripped out.

Answered By: daSong

Answer #8:

Using regex is a solution, but it is too complicated for that case.

You can simply split text into list of words. Use split(separator, num) method for that. It returns a list of all the words in the string, using separator as the separator. If separator is unspecified it splits on all whitespace (optionally you can limit the number of splits to num).

list_of_words = mystring.split()
if word in list_of_words:
    print 'success'

This will not work for string with commas etc. For example:

mystring = "One,two and three"
# will split into ["One,two", "and", "three"]

If you also want to split on all commas etc. use separator argument like this:

# whitespace_chars = " tnrf" - space, tab, newline, return, formfeed
list_of_words = mystring.split( tnrf,.;!?'"()"")

Leave a Reply

Your email address will not be published. Required fields are marked *