I am having a problem with Scrapy: for some reason, when I run the project it does not find my spider, and no matter how much I look at the code I cannot see why. I had already used similar code in another version of the project and the spider was found.
This is the error I get when I run it:
gonzalo@gonzalo-pc:~/Desktop/lanacion/datos$ scrapy crawl datos -t csv
2018-10-29 14:31:01 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: datos)
2018-10-29 14:31:01 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.9.0, Python 3.6.7rc1 (default, Sep 27 2018, 09:51:25) - [GCC 8.2.0], pyOpenSSL 18.0.0 (OpenSSL 1.1.0i 14 Aug 2018), cryptography 2.3.1, Platform Linux-4.18.0-10-generic-x86_64-with-Ubuntu-18.10-cosmic
Traceback (most recent call last):
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 69, in load
    return self._spiders[spider_name]
KeyError: 'datos'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gonzalo/.local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/commands/crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 170, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 198, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 202, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "/home/gonzalo/.local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 71, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: datos'
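For what it's worth, the traceback shows the KeyError coming out of Scrapy's spider loader, which resolves the name passed to scrapy crawl against the name attribute of each spider class in the project. This is a minimal check I have been using to see which names are actually registered, built on Scrapy's public SpiderLoader API (run from inside the project directory):

    from scrapy.utils.project import get_project_settings
    from scrapy.spiderloader import SpiderLoader

    # Print the spider names Scrapy has registered for this project;
    # the argument to `scrapy crawl <name>` must be one of these.
    settings = get_project_settings()
    loader = SpiderLoader.from_settings(settings)
    print(loader.list())

Running scrapy list from the command line prints the same names.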
If anyone could help me I would be very grateful, since this error has me stuck and unable to move forward and finish the project.
Here is the code for my spider, item, and pipeline:
Spider.py

import scrapy
from datos.items import *
class DatosSpider(scrapy.Spider):
    name = 'lanacion'
    allowed_domains = ['lanacion.com.ar']
    start_urls = ['http://www.lanacion.com.ar/economia/divisas']

    def parse(self, response):
        divisas = response.xpath('//*[@id="acumulado"]/section[3]/section[1]/div/div/div')
        items = []
        for divisa in divisas:
            item = DatosItems()
            print(divisa.xpath('label[1]').extract())
            # Field names must match the ones declared in DatosItems
            item['descripcion'] = divisa.xpath('label[1]/text()').extract()
            item['ultimo'] = divisa.xpath('label[2]/b/text()').extract()
            item['anterior'] = divisa.xpath('label[3]/text()').extract()
            item['variacion'] = divisa.xpath('label[4]/text()').extract()
            item['fechahora'] = divisa.xpath('label[5]/text()').extract()
            items.append(item)
        return items
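As an aside, I understand returning the list works, but Scrapy also accepts yielding each item as it is built, which avoids accumulating everything in memory. A minimal variant of the same loop, assuming the XPaths above are correct:

    def parse(self, response):
        # Same extraction as above, but yielding each item as it is
        # built instead of collecting them in a list.
        for divisa in response.xpath('//*[@id="acumulado"]/section[3]/section[1]/div/div/div'):
            item = DatosItems()
            item['descripcion'] = divisa.xpath('label[1]/text()').extract()
            item['ultimo'] = divisa.xpath('label[2]/b/text()').extract()
            item['anterior'] = divisa.xpath('label[3]/text()').extract()
            item['variacion'] = divisa.xpath('label[4]/text()').extract()
            item['fechahora'] = divisa.xpath('label[5]/text()').extract()
            yield item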
Item.py
import scrapy

class DatosItems(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    # Exchange-rate quote info
    descripcion = scrapy.Field()
    ultimo = scrapy.Field()
    anterior = scrapy.Field()
    variacion = scrapy.Field()
    fechahora = scrapy.Field()
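One thing I noticed while testing: a scrapy.Item subclass only accepts the fields declared on it, and assigning any undeclared key raises a KeyError. That is why the spider above assigns item['descripcion'] and not some other name. For example:

    from datos.items import DatosItems

    item = DatosItems()
    item['descripcion'] = 'Dolar'  # OK: 'descripcion' is a declared Field
    item['nombre'] = 'Dolar'       # KeyError: DatosItems does not support field: nombre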
Pipelines.py
from scrapy import signals
from scrapy.exporters import CsvItemExporter

class DatosPipeline(object):
    def __init__(self):
        self.file = {}

    @classmethod
    def from_crawler(cls, crawler):
        pipeline = cls()
        crawler.signals.connect(pipeline.spider_opened, signals.spider_opened)
        crawler.signals.connect(pipeline.spider_closed, signals.spider_closed)
        return pipeline

    def spider_opened(self, spider):
        file = open('%s_items.csv' % spider.name, 'w+b')
        self.file[spider] = file
        self.exporter = CsvItemExporter(file)
        self.exporter.fields_to_export = ['descripcion', 'ultimo', 'anterior', 'variacion', 'fechahora']
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        file = self.file.pop(spider)
        file.close()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item
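In case it matters, the pipeline only runs if it is enabled in settings.py. This is the entry I have, where the module path datos.pipelines is my guess based on the bot name in the log; adjust it to the actual project layout:

    # settings.py -- enable the CSV pipeline (order value 300 is arbitrary;
    # lower numbers run earlier when several pipelines are enabled)
    ITEM_PIPELINES = {
        'datos.pipelines.DatosPipeline': 300,
    }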
Thank you in advance.