How to collect all the image tags from an HTML page in Ruby


My program collects all the link addresses (the href values), but I have not been able to do the same with the image tags:

    require 'nokogiri'
    require 'net/http'

    pagina  = Net::HTTP.get(ARGV[0],ARGV[1])
    enlaces = Nokogiri::HTML(pagina).xpath('//a[@href]').map { |link| link['href'] }
    imagenes = Nokogiri::HTML(pagina).xpath('//img/src').map { |link| link['src'] }
    puts "Los enlaces son: "
    puts enlaces
    puts "Las imagenes son: "
    puts imagenes
asked by Felipe Olaya Ospina 06.11.2017 at 14:55

1 answer


If the page you are processing is ordinary HTML, the problem is that your XPath expression looks for a src element inside the img element, when what you want is the src attribute of the img tag (just as you already do with the href of the a tags).

So you only need to change the XPath expression to //img[@src] in this line of code:

    imagenes = Nokogiri::HTML(pagina).xpath('//img[@src]').map { |link| link['src'] }
answered by 06.11.2017 / 15:20