Iterate over the current selector with XPath and Scrapy

Question

I am scraping information but I am stuck in the for loop. First, when I receive the response, I build a "root" XPath that selects each of the li items inside a ul tag. Then I extract each piece of data I need, starting from that root XPath. This is the code:

import scrapy

class SecccionAmarillaSpider(scrapy.Spider):
    name = "seccion_amarilla"
    start_urls = ['https://www.seccionamarilla.com.mx/resultados/hospitales/1']
    data = {}

    def parse(self, result):
        selectors = result.xpath('//ul[@class="list"]/li')
        for selector in selectors:
            name = selector.xpath('//span[@itemprop="name"]/text()').extract()
            phone = selector.xpath('//span[@itemprop="telephone"]/text()').extract()
            # This data is then stored in MySQL

This is my logic: on each iteration of the for loop, the variable selector holds the current item of the list, so from this selector, and only this one, I should be able to get the information I need, such as name and phone. That is, I expect the variables to be filled in so that I can build an SQL statement:

name = 'Z'
phone = '( 81)8333 4662'
sql = "INSERT INTO TABLE VALUES('"+name+"','"+phone+"')"

Instead, I am receiving all the information in one list, as if the current selector were being ignored and the whole page were scraped again. This is what I am getting:

name = ['Z','Z-GAS']
phone = ['( 81)8333 4662', '(33)3668 3800']

Why?

I cannot correlate the two lists. I also thought about looping over name and matching the name and phone values by index, but the information is very variable: an item may or may not have a name or a phone, so the indexes would not line up.

Any suggestions? I am using Python 3.x and Scrapy 1.5.

python python-3.x scrapy

asked by Alberto Siurob 28.09.2018 в 19:51

1 answer

score 1 · Accepted Answer

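When you iterate over a selector, remember that an XPath is only relative to that selector if it does not start with /. An expression starting with // searches the whole document again, no matter which selector you call it on. To search only inside the current selector, start the expression with a dot (.//).

It should work if you change your code to: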

selectors = result.xpath('//ul[@class="list"]/li')
for selector in selectors:
    # The leading dot makes each XPath relative to the current <li>
    name = selector.xpath('.//span[@itemprop="name"]/text()').extract()
    phone = selector.xpath('.//span[@itemprop="telephone"]/text()').extract()