Select an element of a website using BeautifulSoup

1

link

Hello, good day. I am trying to extract information from that page with Python and BeautifulSoup. So far I have managed to extract and filter the contents of the yellow box at the bottom, but I cannot extract the filtered information from the box above it.

I can extract the part in bold, but I do not understand how to get the element that comes right after it, because it has no class attribute and I do not know how to access it. There are 5 such elements. Any ideas?

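import urllib.request
import bs4 as bs

# html is assumed to hold the page URL; it is not defined in the question
sauce = urllib.request.urlopen(html).read()
soup = bs.BeautifulSoup(sauce, 'html.parser')

# Print the text of every element with the class "negritas"
for negritas in soup.find_all(class_="negritas"):
    print(negritas.get_text())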
    
asked by Carlos Ruz 01.05.2018 at 01:00

2 answers

0

Try next_sibling (nextSibling is the older name for the same attribute), like this:

for negritas in soup.find_all(class_="negritas"):
    print(negritas.get_text())
    print(negritas.next_sibling.get_text())
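
One caveat: if there is whitespace or a line break between the bold element and the one after it, the next sibling is that text node rather than the following tag. A minimal sketch that skips text nodes by calling find_next_sibling(True) is below. The sample markup and the label_value_pairs helper are hypothetical, since the real page's HTML is not shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the structure of the real page is an assumption.
SAMPLE_HTML = """
<div>
  <span class="negritas">Name:</span>
  <span>Carlos</span>
  <span class="negritas">City:</span>
  <span>Madrid</span>
</div>
"""


def label_value_pairs(html):
    """Pair each 'negritas' element with the tag that follows it."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for label in soup.find_all(class_="negritas"):
        # Passing True matches any tag, so whitespace text nodes are skipped.
        value = label.find_next_sibling(True)
        if value is not None:
            pairs.append(
                (label.get_text(strip=True), value.get_text(strip=True))
            )
    return pairs


pairs = label_value_pairs(SAMPLE_HTML)
for label_text, value_text in pairs:
    print(label_text, value_text)
```

If the value on the real page is bare text rather than a tag, next_sibling is the one to use, since find_next_sibling only returns tags.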
    
answered by 01.05.2018 / 02:11
0

One more question. The previous answer worked quite well, but I have a problem: I am only interested in the elements with class_="bold", that is, 6 bold elements with their 6 siblings. How can I select the first 6 bold elements and the 6 siblings that correspond to them without processing the whole page?

I tried this, but it does not work:

soup.find_all("body", {'class': "bold"})[2].getText()

Thank you.
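
A possible reason the attempt above fails: find_all("body", {...}) looks for body tags that carry that class, not for elements inside the body, so there is probably no element at index [2]. A rough sketch of one way to take only the first six matches uses the limit argument of find_all. It assumes the class is the "negritas" one from the question's code and uses made-up sample markup, so adjust both to the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with eight label/value pairs; the real page may differ.
SAMPLE_HTML = "<body>" + "".join(
    f'<p class="negritas">Label {i}</p><p>Value {i}</p>' for i in range(1, 9)
) + "</body>"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Search by class only; limit=6 stops the search after six matches.
first_six = soup.find_all(class_="negritas", limit=6)

results = []
for label in first_six:
    # True matches any tag, so text nodes between the two are ignored.
    value = label.find_next_sibling(True)
    value_text = value.get_text(strip=True) if value is not None else None
    results.append((label.get_text(strip=True), value_text))

for label_text, value_text in results:
    print(label_text, value_text)
```

Slicing the full result, as in soup.find_all(class_="negritas")[:6], gives the same elements, but limit lets the search stop early instead of scanning the rest of the document.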

    
answered by 02.05.2018 at 18:03