When I do ruby htmlImg.rb https://example.com/index.html I get

When I run my script with:

$ ruby htmlImg.rb https://ejemplo.com/index.html

I get this error:

 /usr/share/ruby/2.3.0/net/http.rb:479:in 'get_response': undefined method 'hostname' for "https://twitter.com/Yojiexo":String (NoMethodError)
        from /usr/share/ruby/2.3.0/net/http.rb:456:in 'get'
        from htmlImg.rb:4:in '<main>'

This is the code:

require 'nokogiri'
require 'net/http'

pagina = Net::HTTP.get(ARGV[0],ARGV[1])
enlace = Nokogiri::HTML(pagina).xpath('//pag[@href]').map {|link| link['href']}
imagen = Nokogiri::HTML(pagina).xpath('//img[@src]').map {|link| link ['src']}
puts "Los enlaces son: " + enlace
puts "Las direcciones de imagenes son: " + imagen
    
asked by Yojiexo 06.11.2017 at 23:46

1 answer

Since you are passing the complete URI (i.e. the link), you need to convert it into a URI object (see note 1) so that the get method recognizes it; for example:

require 'nokogiri'
require 'net/http'

pagina = Net::HTTP.get(URI(ARGV[0]))
enlace = Nokogiri::HTML(pagina).xpath('//a[@href]').map { |link| link['href'] }
imagen = Nokogiri::HTML(pagina).xpath('//img[@src]').map { |link| link['src'] }

puts "Los enlaces son: #{enlace}" 
puts "Las direcciones de imagenes son: #{imagen}"
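
To see why the plain string fails, here is a small sketch (no network access needed) that compares a String with the object returned by URI(). The helper name describe_target is hypothetical and only used for illustration; it checks for the hostname method named in the error message.

```ruby
require 'uri'

# Hypothetical helper: reports whether a value responds to `hostname`,
# the method that the error message says is missing on String.
def describe_target(target)
  if target.respond_to?(:hostname)
    "#{target.class}: host=#{target.hostname}, path=#{target.path}"
  else
    "#{target.class}: no hostname method"
  end
end

url = 'https://example.com/index.html'
puts describe_target(url)       # plain String
puts describe_target(URI(url))  # URI object built from the same text
```

The first line printed reports that String has no hostname method, while the second shows the host and path that the URI object exposes.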

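Note that you are passing only one argument to your htmlImg.rb script, so ARGV has a single value, at position 0 (i.e. ARGV[0]), and ARGV[1] is nil. With no path given, Net::HTTP.get treats its first argument as a URI object and calls hostname on it, which is why a String raises NoMethodError.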
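
A minimal sketch of how the script could check its arguments before making the request. The helper url_from_args and the usage message are assumptions for illustration, and the args array stands in for ARGV so the snippet runs on its own.

```ruby
# Hypothetical helper: returns the single URL argument, or raises
# with a usage message when the argument count is wrong.
def url_from_args(args)
  unless args.length == 1
    raise ArgumentError, 'usage: ruby htmlImg.rb URL'
  end
  args[0]
end

args = ['https://example.com/index.html'] # stands in for ARGV
p args[0]              # the URL typed on the command line
p args[1]              # nil, because only one argument was given
puts url_from_args(args)
```

In the real script you would call url_from_args(ARGV) and pass the result to URI().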

Additionally, you will notice some corrections:

  • I replaced pag[@href] with a[@href] so that the links can be found.
  • Since enlace and imagen are arrays, you have to convert them to strings; interpolating with #{..} calls the to_s method automatically.
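
The second point can be sketched as follows. The sample values in enlace are made up, and format_list is a hypothetical helper that shows join as an alternative to interpolation when you want a plainer output.

```ruby
# Hypothetical helper: builds a readable line from a label and an array.
def format_list(label, items)
  "#{label}: #{items.join(', ')}"
end

enlace = ['/about', '/contact'] # sample data, not real output

begin
  puts "Los enlaces son: " + enlace # String + Array is not allowed
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

puts "Los enlaces son: #{enlace}"             # interpolation calls to_s
puts format_list('Los enlaces son', enlace)   # join gives a plain list
```

Interpolation prints the array in its inspect form, with brackets and quotes, whereas join prints only the elements.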

1 See the Ruby documentation for more information about the Net::HTTP class and its get method.

    
answered 07.11.2017 at 00:44