I am scraping information, but I am stuck in the for loop.
First, when I receive the response, I create a "root" XPath to work from. This root XPath selects each and every one of the items inside a ul tag.
Then I pull out each piece of data I need, starting from that root XPath. This is the code:
import scrapy


class SecccionAmarillaSpider(scrapy.Spider):
    name = "seccion_amarilla"
    start_urls = ['https://www.seccionamarilla.com.mx/resultados/hospitales/1']
    data = {}

    def parse(self, result):
        # One selector per <li> inside the results list
        selectors = result.xpath('//ul[@class="list"]/li')
        for selector in selectors:
            name = selector.xpath('//span[@itemprop="name"]/text()').extract()
            phone = selector.xpath('//span[@itemprop="telephone"]/text()').extract()
            # This data is then stored in MySQL
This is my logic. In the for loop, the variable selector holds the current item of the list, so from this selector, and only from this one, I should be able to get the information I need, such as name and phone. That is, I expect the variables to be filled in like this, so that I can build an SQL statement:
name = 'Z'
phone = '( 81)8333 4662'
sql = "INSERT INTO TABLE VALUES('"+name+"','"+phone+"')"
Instead, I am receiving all the information in one array, as if the query were not respecting the current selector and were scraping the whole page again. This is what I am getting:
name = ['Z','Z-GAS']
phone = ['( 81)8333 4662', '(33)3668 3800']
Why?
I cannot correlate the two arrays afterwards. I also thought about looping over name and matching the name and phone indexes; however, the information is highly variable: an item may or may not have a name or a phone, so the indexes would not line up and the relationship between the values would be lost.
Any suggestions? I am using Python 3.x and Scrapy 1.5.
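To make the goal concrete, here is a minimal sketch of the per-item result I am after. It uses the standard library's xml.etree.ElementTree rather than Scrapy so that it runs on its own, and the markup is a hypothetical reduction of the page (the third item, without a phone, is invented to show the variable case). Each lookup starts with a dot so it is relative to the current li; presumably the Scrapy selector would need the same relative form, but that is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, reduced to the parts that matter.
# The third item deliberately has no telephone.
HTML = """
<ul class="list">
  <li><span itemprop="name">Z</span><span itemprop="telephone">( 81)8333 4662</span></li>
  <li><span itemprop="name">Z-GAS</span><span itemprop="telephone">(33)3668 3800</span></li>
  <li><span itemprop="name">NO-PHONE</span></li>
</ul>
"""


def extract_rows(markup):
    """Return one (name, phone) tuple per <li>; a missing field becomes None."""
    root = ET.fromstring(markup)  # root is the <ul> element
    rows = []
    for li in root.findall("li"):
        # The leading "." keeps each search inside the current <li>.
        name = li.findtext(".//span[@itemprop='name']")
        phone = li.findtext(".//span[@itemprop='telephone']")
        rows.append((name, phone))
    return rows


rows = extract_rows(HTML)
for name, phone in rows:
    print(name, phone)
```

With rows shaped like this, each name stays paired with its own phone even when one of them is missing, so no index matching between two separate arrays is needed.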