Objective:
I'm trying to list the product names shown on the page https://www.screwfix.com/c/tools/angle-grinders/cat830694.
For example, get
Titan TTB281GRD
from the title attribute of the link in this part:
<div id="product_box_14" class="lg-12 md-24 sm-24 cols">
  <div id="productID_93905" class="lii lii--j2 lii__offer">
    <div class="lii_head">
      <h3 class="lii__title">
        <a id="product_description_14" href="https://www.screwfix.com/p/titan-ttb281grd-750w-4-angle-grinder-230-240v/93905" descriptionproductid="93905" title='Titan TTB281GRD 750W 4½" Angle Grinder 230-240V'>
          Titan TTB281GRD 750W 4½" Angle Grinder 230-240V
        </a>
        <span id="product_quoteNo_14" quotenumberproductid="93905">
          (93905)
        </span>
      </h3>
    </div>
  </div>
</div>
and get
Makita DGA456Z
from this analogous part:
<div id="product_box_1" class="lg-12 md-24 sm-24 cols">
  <div id="productID_2906R" class="lii lii--j2 lii__offer">
    <div class="lii_head">
      <h3 class="lii__title">
        <a id="product_description_1" href="https://www.screwfix.com/p/makita-dga456z-18v-li-ion-4-brushless-cordless-angle-grinder-bare/2906r" descriptionproductid="2906R" title='Makita DGA456Z 18V Li-Ion 4½" Brushless Cordless Angle Grinder - Bare'>
          Makita DGA456Z 18V Li-Ion 4½" Brushless Cordless Angle Grinder - Bare
        </a>
        <span id="product_quoteNo_1" quotenumberproductid="2906R">
          (2906R)
        </span>
      </h3>
    </div>
  </div>
</div>
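In both examples the wanted name is the first two words (brand and model) of the link's title attribute. A minimal sketch of that trimming step, assuming this two-word pattern holds for every product on the page (it would break for a brand name made of several words); `short_name` is a hypothetical helper, not part of the code in the question:

```python
def short_name(title, words=2):
    """Return the first `words` whitespace-separated words of a title.

    Assumption: the brand and model are always the first two words.
    """
    return " ".join(title.split()[:words])


# Full titles copied from the two HTML excerpts in the question.
print(short_name('Titan TTB281GRD 750W 4½" Angle Grinder 230-240V'))
print(short_name('Makita DGA456Z 18V Li-Ion 4½" Brushless Cordless Angle Grinder - Bare'))
```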
Description:
The values should end up in the variable "titulo".
To reach them, look inside class = "lii_head", then class = "lii__title", and then read the "title" attribute of the link.
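The path described here (lii_head, then lii__title, then the link's title attribute) can be written as a single CSS selector. This is only a sketch run against a shortened copy of the first excerpt, so `SAMPLE_HTML` and `titles_by_selector` are illustrative names, and it assumes the live page uses the same markup:

```python
from bs4 import BeautifulSoup

# Shortened copy of the first excerpt from the question (illustration only).
SAMPLE_HTML = """
<div class="lii_head">
<h3 class="lii__title">
<a id="product_description_14" title='Titan TTB281GRD 750W 4½" Angle Grinder 230-240V'>
Titan TTB281GRD 750W 4½" Angle Grinder 230-240V
</a>
</h3>
</div>
"""


def titles_by_selector(html_text):
    """Collect the title attribute of every link under lii_head > lii__title."""
    soup = BeautifulSoup(html_text, "html.parser")
    # a[title] keeps only the links that actually carry a title attribute.
    links = soup.select("div.lii_head h3.lii__title a[title]")
    return [link["title"] for link in links]


print(titles_by_selector(SAMPLE_HTML))
```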
Code: My program downloads the HTML correctly and filters out the parts I want, but when I try to get the "title" attribute it returns an empty list.
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

URL = "https://www.screwfix.com/c/tools/angle-grinders/cat830694"

# Send the request to the website
req = requests.get(URL)

# Check that the request returns status code 200
status_code = req.status_code
if status_code == 200:
    # Pass the page's HTML content to a BeautifulSoup() object
    html = BeautifulSoup(req.text, "html.parser")
    # print(html)

    # Get all the <h3> elements that hold the entries
    entradas = html.find_all('h3', {'class': 'lii__title'})
    # print(entradas)

    # Loop over all the entries to extract the title
    for i, entrada in enumerate(entradas):
        print(entrada)
        # This is the call that returns an empty list
        titulo = entrada.find_all('a', {'title'})
        # Print the entry number and its title
        print(i + 1, titulo)
else:
    print("Status Code %d" % status_code)
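One possible way to read the attribute, offered as a sketch rather than a confirmed fix: the second argument `{'title'}` is a set, not a dict, so BeautifulSoup most likely does not treat it as "has a title attribute". Passing `title=True` asks for any link that carries the attribute, and indexing the tag with `["title"]` returns its value. `extract_titles` is a hypothetical helper, and it assumes the HTML returned by the request really contains the markup shown in the excerpts:

```python
from bs4 import BeautifulSoup


def extract_titles(html_text):
    """Return the title attribute of the first link in each h3.lii__title."""
    soup = BeautifulSoup(html_text, "html.parser")
    titles = []
    for entrada in soup.find_all("h3", class_="lii__title"):
        # title=True matches any <a> that has a title attribute.
        link = entrada.find("a", title=True)
        if link is not None:
            titles.append(link["title"])
    return titles
```

With the code from the question, this would be called as `extract_titles(req.text)` inside the `status_code == 200` branch, and the result printed with `enumerate` as before.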