I’m having trouble parsing HTML elements with a “class” attribute using BeautifulSoup. The code looks like this:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs:
    if (div["class"] == "stylelistrow"):
        print div

I get an error on that same line “after” the script finishes:

File "./", line 130, in getlanguage
  if (div["class"] == "stylelistrow"):
File "/usr/local/lib/python2.6/dist-packages/", line 599, in __getitem__
   return self._getAttrMap()[key]
KeyError: 'class'

How do I get rid of this error?

Asked By: Neo


Answer #1:

You can refine your search to only find those divs with a given class using BS3:

mydivs = soup.findAll("div", {"class": "stylelistrow"})
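
The KeyError in the question comes from indexing div["class"] on a div that has no class attribute. Below is a minimal sketch of that failure mode and two ways around it. It assumes Python 3 and Beautiful Soup 4 (the bs4 package) rather than BS3, and the sample markup is made up for illustration. Note that in bs4 the class attribute comes back as a list of names, not a single string.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second div has no class attribute at all.
sdata = """
<div class="stylelistrow">first</div>
<div>no class here</div>
<div class="stylelistrow button">third</div>
"""

soup = BeautifulSoup(sdata, "html.parser")

# div["class"] raises KeyError for a tag without the attribute;
# div.get("class", []) returns a default instead.
matches = []
for div in soup.find_all("div"):
    classes = div.get("class", [])  # bs4 returns class as a list of names
    if "stylelistrow" in classes:
        matches.append(div)

# Letting the library do the filtering never touches missing attributes.
filtered = soup.find_all("div", {"class": "stylelistrow"})

print([d.get_text() for d in matches])
print([d.get_text() for d in filtered])
```

Both approaches skip the class-less div instead of failing on it.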

Answer #2:

From the documentation:

As of Beautiful Soup 4.1.2, you can search by CSS class using the keyword argument class_:

soup.find_all("a", class_="sister")

Which in this case would be:

soup.find_all("div", class_="stylelistrow")

It would also work for an exact class string (the class attribute must list the classes in exactly this order):

soup.find_all("div", class_="stylelistrowone stylelistrowtwo")
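
To make the difference between a single class name and a full class string concrete, here is a small sketch. It assumes Python 3 with bs4 and uses invented sample markup. A single name matches any tag carrying that class, while a multi-class string is compared against the whole attribute value, so the order of the classes matters.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: same two classes, in two different orders.
html = """
<div class="stylelistrow">a</div>
<div class="stylelistrowone stylelistrowtwo">b</div>
<div class="stylelistrowtwo stylelistrowone">c</div>
"""
soup = BeautifulSoup(html, "html.parser")

# One class name: matches every div that has it, whatever else it has.
single = soup.find_all("div", class_="stylelistrowone")

# Full string: matches only when the attribute reads exactly like this.
exact = soup.find_all("div", class_="stylelistrowone stylelistrowtwo")

print([d.get_text() for d in single])
print([d.get_text() for d in exact])
```

If the order of classes in the markup is not guaranteed, a CSS selector such as soup.select("div.stylelistrowone.stylelistrowtwo") is the safer choice.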
Answered By: jmunsch

Answer #3:

Update: 2016
In the latest version of BeautifulSoup (version 4), the method ‘findAll’ has been renamed to ‘find_all’. The official documentation lists the method names that changed.

Hence the answer will be

soup.find_all("html_element", class_="your_class_name")
Answered By: overlord

Answer #4:

Specific to BeautifulSoup 3:

soup.findAll('div',
             {'class': lambda x: x
                       and 'stylelistrow' in x.split()
             }
            )

Will find all of these:

<div class="stylelistrow">
<div class="stylelistrow button">
<div class="button stylelistrow">
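
For readers on Beautiful Soup 4, a rough equivalent of the same idea is to pass a function that inspects the whole tag. This is a sketch under assumptions, not the BS3 code above: it assumes Python 3 and bs4, the helper name is_stylelistrow_div is hypothetical, and the sample markup is invented. Because bs4 already splits the class attribute into a list, no x.split() is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, including a near-miss class name and a class-less div.
html = """
<div class="stylelistrow">one</div>
<div class="stylelistrow button">two</div>
<div class="button stylelistrow">three</div>
<div class="stylelistrowextra">four</div>
<div>five</div>
"""
soup = BeautifulSoup(html, "html.parser")


def is_stylelistrow_div(tag):
    """Return True for div tags whose class list contains 'stylelistrow'."""
    # tag.get() avoids a KeyError on tags that have no class attribute.
    return tag.name == "div" and "stylelistrow" in tag.get("class", [])


rows = soup.find_all(is_stylelistrow_div)
print([d.get_text() for d in rows])
```

The membership test is on whole class names, so "stylelistrowextra" is not picked up.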
Answered By: FlipMcF

Answer #5:

CSS selectors

single class first match

soup.select_one('.stylelistrow')

list of matches

soup.select('.stylelistrow')

compound class (i.e. AND another class)

soup.select_one('.stylelistrow.otherclassname')
soup.select('.stylelistrow.otherclassname')

Spaces in compound class names, e.g. class = stylelistrow otherclassname, are replaced with “.”. You can continue to add classes.

list of classes (OR: match whichever is present)

soup.select_one('.stylelistrow, .otherclassname')
soup.select('.stylelistrow, .otherclassname')

bs4 4.7.1+

Specific class whose innerText contains a string

soup.select_one('.stylelistrow:contains("some string")')
soup.select('.stylelistrow:contains("some string")')

Specific class which has a certain child element e.g. a tag
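
One way to express the child-element case is the :has() pseudo-class. The sketch below assumes Python 3 and bs4 4.7.1 or later with its Soup Sieve selector engine installed; the markup is invented, and the "> a" form (direct child only) is an illustrative choice. It also exercises the single, compound, and OR selectors for comparison.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first div contains an <a> child.
html = """
<div class="stylelistrow"><a href="#">link</a></div>
<div class="stylelistrow otherclassname">plain</div>
<div class="otherclassname">other</div>
"""
soup = BeautifulSoup(html, "html.parser")

first = soup.select_one(".stylelistrow")                 # first match only
both = soup.select(".stylelistrow.otherclassname")       # AND: has both classes
either = soup.select(".stylelistrow, .otherclassname")   # OR: has either class
with_link = soup.select(".stylelistrow:has(> a)")        # has a direct <a> child

print(first.get_text())
print([d.get_text() for d in both])
print([d.get_text() for d in either])
print([d.get_text() for d in with_link])
```

Dropping the ">" (that is, ":has(a)") would match an a tag at any depth below the div rather than only a direct child.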

Answered By: QHarr

Answer #6:

A straightforward way would be:

soup = BeautifulSoup(sdata)
for each_div in soup.findAll('div',{'class':'stylelistrow'}):
    print each_div

Make sure you take care of the casing of findAll; it’s not findall.

Answered By: Konark Modi

Answer #7:

I’m having trouble parsing html elements with “class” attribute using Beautifulsoup.

You can easily find by one class, but if you want to find by the intersection of two classes, it’s a little more difficult.

From the documentation (emphasis added):

If you want to search for tags that match two or more CSS classes, you should use a CSS selector:

css_soup.select("p.strikeout.body")
# [<p class="body strikeout"></p>]

To be clear, this selects only the p tags that have both the strikeout and body classes.

To find the union of a set of classes (tags that have any of them, rather than all of them), you can give a list to the class_ keyword argument (as of 4.1.2):

soup = BeautifulSoup(sdata)
class_list = ["stylelistrow"] # can add any other classes to this list.
# will find any divs with any names in class_list:
mydivs = soup.find_all('div', class_=class_list)
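
The intersection and union cases can be compared side by side. This is a minimal sketch assuming Python 3 and bs4; the p tags are made up to mirror the documentation's strikeout/body example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup covering both, one, and neither of the two classes.
html = """
<p class="body strikeout">both</p>
<p class="body">body only</p>
<p class="strikeout">strikeout only</p>
<p class="other">neither</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Intersection: the tag must carry both classes (order does not matter).
intersection = soup.select("p.strikeout.body")

# Union: the tag must carry at least one class from the list.
union = soup.find_all("p", class_=["strikeout", "body"])

print([p.get_text() for p in intersection])
print([p.get_text() for p in union])
```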

Also note that findAll has been renamed from the camelCase to the more Pythonic find_all.

Answered By: Aaron Hall

Answer #8:

As of BeautifulSoup 4+,

If you have a single class name, you can just pass the class name as a parameter, like:

mydivs = soup.find_all('div', 'class_name')

Or if you have more than one class name, just pass the list of class names as a parameter, like:

mydivs = soup.find_all('div', ['class1', 'class2'])
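
As a quick check of the positional form, the sketch below (assuming Python 3 and bs4, with invented markup) shows that a string in the second position behaves like the class_ keyword, and that a list matches tags having any of the listed classes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with three differently classed divs.
html = """
<div class="class1">one</div>
<div class="class2 extra">two</div>
<div class="class3">three</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A non-dict second argument is treated as a filter on the class attribute.
positional = soup.find_all("div", "class1")
keyword = soup.find_all("div", class_="class1")

# A list matches divs that have any one of the listed classes.
any_of = soup.find_all("div", ["class1", "class2"])

print([d.get_text() for d in positional])
print([d.get_text() for d in keyword])
print([d.get_text() for d in any_of])
```

The keyword form is arguably easier to read, since the positional string gives no hint that it refers to a class.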
Answered By: Shivam Shah

